THE ADJOXO EXPERIMENTS -- COMPARING FREJA, HOOD AND REDEX TRAILS
ART Memo 8 (Version 0)
Colin Runciman, 14 July 2000

INTRODUCTION

When Henrik Nilsson and Jan Sparud visited us recently, one of our
activities was to try out each of Freja, Hood and Redex Trails on the
same example of a deliberately broken program.

Freja is Henrik's compiler that builds Evaluation Dependence Trees to
support so-called algorithmic debugging. Hood is Andy Gill's observation
debugging system.

This memo is a record of a few notes I made, highlighting a few of the
strengths and weaknesses of each system that became apparent in this
particular exercise.  (In a later version of the memo I may add more
by way of reflection and wider discussion. It would also be good
to repeat the exercise using a different example program.)

THE ADJOXO PROGRAM

We used a program of about 100 lines: an adjudicator for noughts and
crosses positions. For example, given the input:

 O | X |  
---+---+---
   |   |  
---+---+---
   |   | 

The Adjoxo program's output should be:

if X to play: DRAW
if O to play: WIN for O

The source of the (broken) program follows.  It has been broken in three
places: line 27; lines 43&44 and lines 86&91.

     1  -- Adjudicator for noughts-and-crosses positions
     2  -- See *.in for example inputs.
     3  -- Colin Runciman

     5  data GameValue = Loss | Draw | Win deriving (Eq,Ord,Show)

     7  bestOf :: GameValue -> GameValue -> GameValue 
     8  bestOf Win  v   = Win
     9  bestOf Loss v   = v
    10  bestOf Draw Win = Win
    11  bestOf Draw v   = Draw

    13  inverse :: GameValue -> GameValue
    14  inverse Loss = Win
    15  inverse Draw = Draw
    16  inverse Win  = Loss

    18  -- Positions in the playing grid are numbered 1..9 with
    19  -- top row 1,2,3; middle row 4,5,6; bottom row 7,8,9.
    20  -- Ordered lists of such numbers represent O's or X's.

    22  type Region = [Int]
    23  type Player = Region

    25  insert :: Int -> Region -> Region
    26  insert x []                    = [x]
    27  insert x xs@(y:ys) | x > y     = insert x ys
    28                     | otherwise = x : xs

    30  dif :: Region -> Region -> Region
    31  dif [] ys = []
    32  dif xs [] = xs
    33  dif xs@(x:xs') ys@(y:ys') =
    34    case compare x y of
    35    LT -> x : dif xs' ys
    36    EQ ->     dif xs' ys'
    37    GT ->     dif xs  ys'

    39  subset :: Region -> Region -> Bool
    40  subset xs ys = null (dif xs ys)

    42  hasLine :: Player -> Bool
    43  hasLine p = subset [1,2,3] p || subset [4,5,6] p || subset [7,8,9] p &&
    44              subset [1,4,7] p || subset [2,5,8] p || subset [3,6,9] p &&
    45                        subset [1,5,9] p || subset [3,5,7] p

    47  gridFull :: Player -> Player -> Bool
    48  gridFull ap pp = length ap + length pp == 9

    50  -- The `ap' argument of `analysis' is a position list for the active player
    51  -- (to move next); and the `pp' argument is the list for the passive player.
    52  -- The result represents the outcome with best play on both sides.

    54  analysis :: Player -> Player -> GameValue
    55  analysis ap pp =
    56    if      hasLine pp then Loss
    57    else if gridFull ap pp then Draw
    58    else    foldr1 bestOf (map moveval (([1..9] `dif` ap) `dif` pp))
    59    where
    60    moveval m = inverse (analysis pp (insert m ap))

    62  -- The argument to `parsed' is an input string, and its result
    63  -- is the corresponding pair of position-lists for O's and X's.
    64  -- Example:  O | O | 
    65  --          ---+---+---
    66  --             |   |      ==> ([1,2],[7,8])
    67  --          ---+---+---
    68  --           X | X | 

    70  parsed input = (region 'O' input, region 'X' input)
    71    where
    72    region = region' 1
    73    region' _ _ [] = []
    74    region' n p (c:cs) =
    75      if c=='|' then region' (n+1) p cs
    76      else if c=='-' then region' (n+1) p (dropWhile (`elem` "-+") cs)
    77      else if c==p then n : region' n p cs
    78      else region' n p cs

    80  -- The argument to `adjudicate' represents positions of O's and X's.
    81  -- Its result is a short adjudicator's report.

    83  adjudicate :: (Player, Player) -> String
    84  adjudicate (os,xs) =
    85    case compare (length os) (length xs) of
    86    LT -> report (analysis xs os) 'X'
    87    EQ -> if hasLine xs then report Win 'X'
    88          else if hasLine os then report Win 'O'
    89          else "if X to play: " ++ report (analysis xs os) 'X' ++
    90               "if O to play: " ++ report (analysis os xs) 'O'
    91    GT -> report (analysis os xs) 'O'

    93  report Loss p = report Win (opp p) 
    94  report Win  p = "WIN for " ++ [p] ++ "\n"
    95  report Draw _ = "DRAW\n"

    97  opp 'O' = 'X'
    98  opp 'X' = 'O'

   100  main =
   101    do
   102       input <- getContents
   103       putStr (adjudicate (parsed input))
   
ADJOXO AND FREJA

+ Strictification and the rule collapsing unevaluated expressions to `?'
  worked well for this example.  Almost all the equational queries were
  very readable, and most were one-liners.
  
+ Steady progress towards the bugs: it took 25 yes/no questions to locate
  the first fault (ie to identify a specific equation in the program
  that must be wrong) and another 25 to reach the second.
  
+ Including free variable bindings in queries (eg. bindings for `ap' and
  `pp' in connection with queries about applications of `moveval') made
  things very clear.
  
- It was hard to answer a few questions, with expressions full of undefined
  and ? symbols.
  
- No memoisation of answers, so same question could be repeated; and no way
  to generalise answers, so could have a series of very similar questions
  all requiring the same answer.
  
- It would have been useful to assert on-the-fly that a certain function
  (eg. hasLine) was trusted, avoiding many questions about it.

- When a specific component in a result was wrong (eg. the last element in
  a list) there was no way to say so.  One had to try and steer the system
  towards the offending value by a series of artificial answers about other
  components and expressions.  The `where' command gives some support for
  this, by showing the ancestral chain of the current question in the EDT.

ADJOXO AND HOOD

+ The way that functions are included among observable values generally
  served us well for this example, and most observations we made were of
  functions.

+ The replay facility, showing the order in which components of a data
  structure were demanded by the context was particularly instructive.

+ The presentation of observations using pretty-printing and browser
  tools worked nicely (though as the browser must load the whole
  observation file, it could not handle our largest observation files).
  
- The program had to be modified.  The initial choice of what to observe
  seemed arbitrary -- and to specify the observations we wanted it was
  often necessary to modify the program beyond mere annotation: as a simple
  example, we redefined analysis = observe "analysis" analysis' and renamed
  analysis to analysis'.

- We had to add </observe> by hand to the end of observation logs for
  an interrupted non-terminating program.

- Without some notion of trusted functions, observations of higher
  order functions (eg. foldr1) can be very unwieldy.
  
- Observations of locally defined functions can be misleading. One is really
  seeing an enumeration of results for a family of different functions, with
  different values for free variables. Eg. in the same observation of the
  moveval function, `moveval 8 => Draw' and `moveval 8 => Win' can appear.

ADJOXO AND REDEX TRAILS

* Each of the three faults could be traced along a very short trail
  from the starting point of an interruption or incorrect output.
  
* Comparatively full information was available -- for example, it was often
  appropriate to select a component of a trace expression already on display,
  such as the truth value in a conditional.

* Each new expression in the trace display provided a source of relevant clues
  such as sharing information and source links.
  
- At times there was too much information in the displayed trail, and too
  much choice of what to examine next, so the less experienced users could
  easily get lost examining an irrelevant region of the redex trail.

- Although the inclusion of unevaluated expressions in the display (eg. for
  elements of a list) sometimes provided helpful context, they could also
  obscure higher level properties (eg. the length of the list).

- The larger traced computations were very slow -- more than ten times slower
  than their untraced counterparts.
  
- The colour coding in the trail browser is a mixed blessing: expressions
  flashing between colours at every change in the mouse selection can
  be very distracting.