TRIP REPORT: IFIP WG 2.8, 23-27 APRIL 2001, ÅRE, SWEDEN

ART Memo 21 (Version 0, 3 May 2001)

Colin Runciman

INTRODUCTION

WG 2.8 is the IFIP working group on Functional Programming.  It meets at
intervals of less than a year.  Participation is by invitation, but there are
elected members who are always invited.  There are no submitted papers and no
published proceedings -- the schedule is determined on-the-fly.  There are
talks, but with a high degree of interaction, and lots of out-of-session
discussions.  Some talks take the form of suggested exercises.

NOTES FROM THE TALKS

MONDAY 23/4/01

These Monday notes tend to be rather brief because I was tired after
travelling overnight -- and the sleeper train from Stockholm was delayed, so I
wasn't there for the start of the meeting.

Erik Meijer: compiling Mondrian to Java
---------------------------------------
I missed this talk but I got the gist of it from Erik later in the week.
Motivation: functional scripting.  Problems: functional results and lazy
evaluation.

Olivier Danvy: a programming exercise
-------------------------------------
Given lists xs, ys of equal length n, compute zip xs (reverse ys) using only
n conses.

Similar problem: given a list of length 2n or 2n+1, check whether it is a
palindrome; do this in n recursive calls, 2n head operations, 2n tail
operations (or fewer if it is not a palindrome) and no heap allocation.  A
variant: as above, but test whether the list is of the form w++w.

Another problem: split a list of integers into a list of lists, dividing at
occurrences of zero and using cons *only* to construct the result.

...

Later in the week Olivier gave a couple of solutions to the first problem.

Solution A: use a higher-order accumulator -- trading cons allocation for
closure allocation:

  zr (xs,ys) = w xs (\(r,y:ys) -> a ((x,y):r,ys))
    where w ...

Solution B: instead of CPS use direct style:

  zr (xs,ys) = r
    where (r,[]) = w xs
          w ...
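Filling in the elided parts, a complete direct-style version along the lines
of Solution B might look like this (a sketch of mine under one obvious reading
of the notes; the names zipRev and walk are not Olivier's):

  -- Walking down xs returns, at each step, the result built so far paired
  -- with the part of ys not yet consumed; matching y:ys' on the way back up
  -- pairs each x with the elements of ys in reverse order.  Only the n
  -- conses of the result list are allocated.
  zipRev :: [a] -> [b] -> [(a,b)]
  zipRev xs ys = r
    where
      (r, []) = walk xs
      walk []      = ([], ys)
      walk (x:xs') = let (r', y:ys') = walk xs'
                     in  ((x,y):r', ys')

For example, zipRev [1,2,3] "abc" gives [(1,'c'),(2,'b'),(3,'a')], the same
as zip [1,2,3] (reverse "abc").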
Ralf Hinze: programming challenge
---------------------------------
Make an efficient (n log n) version of:

  packFirstFit :: [Item] -> [Bin]
  packFirstFit = foldl firstFit []

  firstFit :: [Bin] -> Item -> [Bin]
  firstFit []     i = [i]
  firstFit (b:bs) i
    | b+i <= 1  = b+i : bs
    | otherwise = b : firstFit bs i

...

Later in the week Ralf described his solution -- `A simple implementation
technique for priority search queues'.  His purposes were (a) to advertise
priority search queues, (b) to describe a new implementation technique, and
(c) to promote views.  A priority search queue = a finite map + efficient
access to the binding with minimum value.  Example application: the
single-source shortest path problem -- a functional implementation of
Dijkstra's algorithm.  Implementation of priority search queues using
semi-heaps and `priority search pennants'.  For details see Ralf's web page:

  http://www.cs.uu.nl/~ralf/publications.html

Still later in the week, Ralf and I used Hat to try to find a bug in his
bin-packing implementation -- separate comments about that towards the end of
this memo.

Fritz Henglein: value-oriented programming
------------------------------------------
Value-oriented programming is a re-badging of functional programming for O-O
programmers working with internet applications.  Values are just immutable
objects.  References are made location-independent by maintaining indirection
tables with content hashing.  Programs are also values, with the slogan
`Software is Data'.

Martin Odersky: types for objects and modules
---------------------------------------------
Languages such as Moby (sp?) and OCaml unify objects and modules, but their
type systems have problems.  The talk was about the type system for Funnel
(language report and compiler in preparation) -- another language with
unified objects and modules, but with a much restricted type system in its
preliminary version.

TUESDAY 24/4/01

Simon Peyton Jones: object-oriented style overloading for Haskell
-----------------------------------------------------------------
Motivation -- access to libraries in other languages.  How can OO classes +
methods be represented in Haskell types and signatures?  The problem is that
OO libraries make heavy use of subtyping and ad hoc overloading.  Bad
solutions include explicit coercion functions and name mangling.  Instead
Simon proposes to introduce subtypings into type constraints.  (It would be
possible to do quite a bit using `ordinary' multi-parameter classes, but some
things are tricky -- eg. transitivity of subtyping.)  There is no implicit
subtyping, only the explicit coerce :: (a <: b) => a -> b.  Class constraints
can also be used to represent overloading, but the constraint-solving problem
is hard.
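As a flavour of the `ordinary' multi-parameter class encoding that Simon
contrasted with his proposal, here is a minimal sketch of my own (the class
Sub and the example types are my inventions, not anything from the talk):

  {-# LANGUAGE MultiParamTypeClasses #-}

  -- Encoding explicit coercion with an ordinary two-parameter class.
  class Sub a b where
    coerce :: a -> b

  newtype Circle = Circle Double                   -- radius
  data    Shape  = SCircle Double | SSquare Double

  -- Every Circle can be coerced to a Shape...
  instance Sub Circle Shape where
    coerce (Circle r) = SCircle r

  area :: Shape -> Double
  area (SCircle r) = pi * r * r
  area (SSquare s) = s * s

  -- ...but only explicitly: there is no implicit subtyping.
  circleArea :: Circle -> Double
  circleArea = area . coerce

The trickiness Simon mentioned shows up as soon as one wants transitivity: an
instance like (Sub a b, Sub b c) => Sub a c overlaps with everything, which
is one argument for making (a <: b) constraints a built-in notion instead.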
Stephanie Weirich: polytypic programming and intensional type constructor analysis
-----------------------------------------------------------------------------------
Stephanie first reviewed polytypic programming in Ralf H's style.  The same
definition can be parametrised not only by type but also by type constructor.
Previously, the semantics of polytypic definitions has been given by a
translation that specialises away polytypism.  Regarding polytypic
application as computation over type constructors, what direct operational
semantics can be given to polytypic definitions?  The semantics given in the
talk actually considers computations over *terms representing* types: this
distinction not only reflects current implementations, it also seems
necessary to handle substitutions for type variables.

Thomas Arts: development of formally verified Erlang programs
-------------------------------------------------------------
Motivation -- software correctness.  Two methods: the EVT prover for Erlang
(not this talk) and model checking (this talk).  The novelty is modelling the
design patterns of Erlang.  Example: client-server communication for
exclusive access to a resource, with asynchronous communication via unbounded
message streams.  Thomas described several pragmatic tricks for reducing
infinite cases to finite ones, and programming patterns that replace
asynchronous communication by synchronous calls combining send and receive.
There are transformations to reduce programs to a core form.  Transition
systems can be generated for quite advanced Erlang programs, and
model-checking performance is good.

Koen Claessen: Veri-QuickCheck
------------------------------
Koen first reminded us about QuickCheck, with its combinators to define
pseudo-random generators for test data.  The generators use trees of random
numbers, not just sequences, to avoid conflict with lazy evaluation.  There
are limiting assumptions: all *generated* data structures are finite and all
*generated* functions are strict.  An example:

  prop_SortsStable =
    forAll orderedList $ \xs ->
      sort xs == xs

Here orderedList is a generator of type Gen [Int :-> Int].

A similar scheme has been used to codify properties in Lava -- a verification
system for hardware circuits.  Even though the use of propositions is very
different, the implementation of `verify' in Lava is very similar to the
implementation of the quickCheck testing function.  Trees of random numbers
are now used to generate fresh propositional variables.  The resulting
propositions are passed to an established propositional prover --
Stålmarck's.  The main limitation of Lava is that all the properties must be
encodable in propositional logic.  Koen has also tried combining the power of
the two systems by letting each generator have the choice of generating
pseudo-random values or of generating a new propositional variable.  No
clear-cut results yet.
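For readers who have not seen QuickCheck, here is a minimal self-contained
variant of the example above (my own simplification, using a plain Gen [Int]
and today's Test.QuickCheck module name rather than Koen's [Int :-> Int]
generator):

  import Test.QuickCheck
  import Data.List (sort)

  -- A generator of ordered lists, built by sorting arbitrary lists
  -- (a local definition for illustration, not Koen's orderedList).
  orderedList :: Gen [Int]
  orderedList = fmap sort (listOf arbitrary)

  -- Sorting an already-ordered list should leave it unchanged.
  prop_SortOrdered :: Property
  prop_SortOrdered =
    forAll orderedList $ \xs ->
      sort xs == xs

  main :: IO ()
  main = quickCheck prop_SortOrdered

Running main tests the property on 100 pseudo-randomly generated ordered
lists.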
Karl-Filip Faxen: the costs and benefits of cloning in a lazy functional language
---------------------------------------------------------------------------------
Karl gave the same talk as at the Cambridge meeting.  Cloning and inlining;
deep and shallow cloning; sequencing of compiler passes; optimising
thunk/eval; call-site dependence; cost of cloning with and without sharing;
reducing cache misses by procedure placement.

WEDNESDAY 25/04/01

Didier Remy: a variant of the join calculus
-------------------------------------------
Despite the name, the `join calculus' is not really a calculus for
programming, but rather a programming system with a `soup' of concurrent
objects and `reactions' between objects -- reminiscent of Gamma, the Chemical
Abstract Machine etc.  The motivating context is concurrent distributed
programming.  There may be concurrency within objects, not just concurrent
sequential objects.  The join calculus uses `messages' (cf. Prolog terms or
atoms) as a uniform representation of methods, state and synchronisation.
For example, a one-place buffer:

  obj b = self (z)
             Get(r) & SOME(a) > r.Return(a) & z.EMPTY()
          or Put(a,r) & EMPTY() > r.Return() & z.SOME(a)
  in ... the soup ...

The explicit synchronisation is actually unnecessary:

  self(z) Put(a) & Get(r) > r.Return(a)

The main new ingredient that Didier presented was a new operation of
`selective refinement' that can be used to express multiple inheritance.
Selective refinement matches parts of the LHS in rules, replacing those parts
by variants binding at least the same variables, and *extending* the RHS with
new parts.  For example:

  class lockable_buffer = self(z)
    match buffer with
      Put(a,r) => Put(a,r) & FREE() > z.FREE()
    end
    or Suspend(r) & FREE() > r.Reply() & z.LOCKED()
    or Resume(r) & LOCKED() > r.Reply() & z.FREE()

It is important to realise that there is no guarantee of computational
refinement, only refinement at the level of types and signatures -- just as
in Haskell classes.  There is an implementation in JoCaml --
http://pauillac.inria.fr/jocaml/

Manuel Chakravarty: a high-performance array implementation
-----------------------------------------------------------
Motivation: support applications in scientific computing.  There are many
problems with standard Haskell arrays: boxing, irregular arrays not well
supported, construction from association lists, etc.  Manuel described a new
data type for *parallel* arrays.  Controversial semantics: touch one element
and all elements are evaluated.  A flattening transformation changes the
representation of arrays with structured elements into trees with arrays of
unboxed basic values at the leaves.  So far so good, but operations on the
flattened unboxed arrays are very fine-grained.  So the next step is to
introduce array fusion by equational transformation, implemented by GHC
rewrite rules.  Implementation status: most of PreludeList replicated for
PArrays + other folds/scans; basic fusion works; still working on
transformation rules and optimisations.  Example performance: 50% overhead
wrt C for sparse-matrix multiplication.
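These notes don't include Manuel's actual rules, but the GHC rewrite-rule
mechanism itself can be shown with the standard map/map example (my
illustration, not one of the PArrays rules):

  module MapFusion where

  -- GHC applies RULES during simplification: any expression matching the
  -- left-hand side is rewritten to the right-hand side, so the intermediate
  -- list from the inner map is never built.
  {-# RULES
  "map/map"  forall f g xs.  map f (map g xs) = map (f . g) xs
    #-}

  -- With the rule in scope this compiles to a single traversal.
  doubleThenInc :: [Int] -> [Int]
  doubleThenInc xs = map (+1) (map (*2) xs)

(In practice GHC's own build/foldr fusion rules already cover this list case;
the point for PArrays is that analogous rules can be stated for the unboxed
array operations.)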
Joe Armstrong: turning ideas into businesses -- Erlang-based developments
-------------------------------------------------------------------------
Erlang now has bit-string pattern-matching, scripting and gtk interfaces, and
tools like emacs are being implemented in Erlang.  In 1998 Ericsson banned
further use of Erlang!  But it also agreed to make Erlang open-source,
whereupon the development team left to form Bluetail.  Out of 75 initial
ideas for a product, they selected one -- the e-mail robustifier (the target
is an e-mail server that fails only once in 10 years); in 1999 Bluetail ran
always 3 weeks from bankruptcy until the robustifier was sold to a major
e-mail provider, and SEK 27M of venture capital was injected.  In 2000
Bluetail were contacted by Alteon at a trade fair -- they fulfilled a
contract with them by an Erlang implementation in 2 months, leading to SEK
45M more capital, and soon after that the purchase of Bluetail outright for
SEK 1400M (a lot of money!).  In 2001 Bluetail became a division of Alteon,
with a new product -- a robust platform for distributed systems, allowing
operations such as upgrading of software without taking machines down.  As
usual in Joe's talks, there were lots of anecdotes.

THURSDAY 26/4/01

Francois Pottier: hierarchical row types
----------------------------------------
A row is a finite representation of an infinite series of bindings of types
to labels.  For example, a row type indicating the presence of int fields a
and b, but the absence of any other fields, is {a: Pre int; b: Pre int; *:
Abs}.  Idea: introduce a partial order on labels: eg. {a:*(Pre int); -} is a
row where all labels less than a are present as int fields.  Now rows are
finite representations of infinite *trees*.  Another eg: {a:{b:t1;t2};t3}
will be well-formed if b < a.

...

M is improved by N if, whenever C[M] | (h,s), also C[N] | (f h, f s), where
`| (h,s)' means `gives a terminating computation needing no more than h units
of heap and s units of stack'.  Various improvement laws for different kinds
of expressions.  In general if M is an improvement of N then freevars M must
be a superset of freevars N.  Strong improvement requires an exact bound --
the function f above must be the identity.  Too sharp for most purposes, but
with the insertion of space-gadgets (eg. `stack spikes', `heap ballast') one
can often show weak improvement by way of strong improvements of variants
including space-gadgets.  Can show space equivalence of examples such as
any p <-> or . map p.  Re-associating ++ is not safe if safety conditions
must hold for each of heap and stack separately.

Paul Hudak: FRP for Real -- real-time functional reactive programming
---------------------------------------------------------------------
Fran: an animation example to review behaviours, events and FRP:

  growFlower = stretch size flower
    where size  = 1 + integral bSign
          bSign = 0 `until`
                      (lbp ==> -1 `until` lbr ==> bSign)
                  .|. (rbp ==>  1 `until` rbr ==> bSign)

Applying these ideas to robotics, Frob = FRP + robot controller +
robot/vision library.  Frob presents several problems of real-time control,
exceptions etc. that do not occur in Fran.  Challenge: autonomous
co-ordinated motion -- try to emulate the natural behaviour of fish/animals
in schools/herds.  Eg. a pair of creatures circling each other as they move
forward together.  Also FranTk and FrVision.  A lot of work done, but the
formal semantics is still not clear.  A paper at PLDI'00 showed that a
discrete operational semantics converges in the limit to a continuous
denotational semantics.  Implementation issues: how to guarantee time/space;
could FRP be used for embedded systems?  what is the operational semantics?
Goal: Real-Time FRP -- a restricted subset of FRP with guarantees.  Notation
and type-system for RT-FRP presented.  Two prototypical recursive integrals
are directly expressible.  An operational semantics based on transitions: for
a given input i at a given time t, a term s evaluates to v on transition to
s'.  Key results:

  * type safety -- no core dumps!
  * each step constant time -- no time leaks!
  * term-size cannot grow -- no space leaks!
  * constrained form of recursion guarantees progress -- no deadlock!

Colin Runciman: Hat Tricks -- tracing Haskell programs
------------------------------------------------------
I began with the questions (a) What is a trace? and (b) How is a trace
produced and used?  I reviewed the answers to these questions in Hood, Freja
and Redex Trails, using insertion sort as an example, and pointing out
strengths and weaknesses.  I explained that our aim in Hat was to generate
`universal traces' that could incorporate all three of the previous models.
I gave a demonstration based on Adjoxo, using hat-check -a, hat-observe,
hat-detect and hat-stack.  (I didn't use hat-trail because graphics were
broken on the laptop.)  Among remaining problems I highlighted the
representation of function-level and monadic traces, the speed of trace
generation and the integration of tools.  I also invited people to
collaborate in (a) porting to other implementations and (b) user evaluation.
(I had quite a bit of feedback -- see later.)

FRIDAY 27/4/01

Lennart Augustsson: BlueSpec
----------------------------
Lennart now works for Sandburst, a new company set up by Arvind, working on
the design and construction of specialised hardware such as 10Gbit routers
for the internet.  BlueSpec is an innovative hardware-description language
designed and implemented by Lennart.  It is a 2-level language: that is,
computation is staged.  A Haskell equivalent is available at *compile-time*.
All function applications occur at compile time, achieving specialisation.
At run-time the computational model is based on Arvind's non-deterministic
rewriting systems.  Generated code can be expressed, for example, in Verilog:
BlueSpec is in effect a (highly sophisticated) Verilog preprocessor.

  mkGCD :: Module Empty
  mkGCD = module
            x :: Reg Int
            x <- mkReg initX
            y :: Reg Int
            y <- mkReg initY
            rules
              when x > y, y /= 0 ==> { x := y; y := x }
              when x <= y, y /= 0 ==> { y := y-x }
            interface { }   -- i/face to the outside world, here Empty

This is a *generator* for a hardware GCD -- computing by non-deterministic
rewriting.  The key to good performance is to apply many rewrite rules in
parallel -- but the semantics is guaranteed to correspond to some sequential
ordering of the rewrites -- there is an analysis that determines which rules
can be selected to ensure this.  There are no actions within BlueSpec; they
are all `outside' -- for example, a register:

  interface Reg a =
    get :: a
    set :: a -> Action

  mkReg :: (Bits a sa) => a -> Module (Reg a)

Lennart described some of the implementation tricks (others are secret!) such
as generic tabulation, and an encoding of numbers to give restricted
dependent types.  The number types have a distinguished kind #, and
Bit :: # -> *, so Bit 32 is a type.  Functional dependencies between types
make it possible to define functions over fixed-length bit-strings.  For
example, with:

  class Add i j k | i j -> k, j k -> i, k i -> j

  bitAppend :: (Add i j k) => Bit i -> Bit j -> Bit k
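The same trick can be played in ordinary Haskell with type-level naturals (a
rough sketch of mine, not Lennart's implementation; note that only two of the
three dependencies above survive this particular encoding):

  {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
               FlexibleInstances, UndecidableInstances #-}

  -- Widths as type-level naturals; Bit n is a width-indexed bit-string
  -- (all the names here are my own inventions).
  data Z   = Z
  data S n = S n

  newtype Bit n = Bit [Bool]   -- invariant: list length matches n

  -- Add i j k holds when i + j = k; the dependencies let the compiler
  -- compute k from i and j, or j from i and k.  (BlueSpec's third
  -- dependency, j k -> i, cannot be added to this encoding: it conflicts
  -- with the Z instance.)
  class Add i j k | i j -> k, i k -> j

  instance              Add Z     j j
  instance Add i j k => Add (S i) j (S k)

  -- Appending bit-strings adds their widths.
  bitAppend :: Add i j k => Bit i -> Bit j -> Bit k
  bitAppend (Bit xs) (Bit ys) = Bit (xs ++ ys)

For example, appending a Bit (S (S Z)) to a Bit (S Z) yields a
Bit (S (S (S Z))), and any width mismatch is a type error.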
BlueSpec also has an expression _ (underscore) for occasions where the value
does not matter -- for example, an irrelevant bit in a fixed-length
bit-string.  It turns out that underscores are very important for the
generation of efficient hardware.

Fritz Henglein: imperative region-based memory management
---------------------------------------------------------
Dynamic memory as a linked list of regions.  Good data-cache behaviour, fast
deallocation, real-time memory management amenable to analysis, semantic
garbage deallocation.  Context/comparison with Tofte-Talpin '94.  Eg. the
game of life:

  nextgen [R] g = read g from R;
                  create and return next generation in R

  life [Rn,Rg] (n,g) =
    if n=0 then g
    else letregion Rn'
         in life [Rn',Rg] (n-1 @ Rn', nextgen [Rg] g)
         end

The ML Kit with regions overcomes some of the problems with TT94: eg. it
regains tail-recursion and has a storage-mode analysis allowing reallocation
from the base of an existing region.  Aiken-Fahndrich-Levien achieved better
region use by moving allocation/deallocation closer to first/last uses,
reducing lag and drag.  Walker-Crary-Morrisett 2000: a calculus of
capabilities; the source language is CPS programs; all calls have explicit
region passing.  In FH's system regions are reference-counted, with
operations:

  new R          -- bind new region to R
  release R      -- set R to null
  R := alias R'  -- assign contents of R' to R
  R := R'        -- equivalent to R := alias R'; release R'

It should be able to emulate the ML Kit and AFL.  There is a prototype system
for experiments, with evolving implementations in Standard ML and in Java.

Josef Svenningsson: constraint abstractions
-------------------------------------------
Motivating problem: how to make complex type systems, with constraints such
as subtyping, `scale up'.  Idea: don't copy constraints, use constraint
abstractions.  Result: flow analysis with flow-polymorphic recursion in
O(n^3).

John Hughes: cache-aware functional programming
-----------------------------------------------
Motivation: John heard of a propositional prover (written in C) that had been
sped up by a factor of ten by reprogramming it to be cache-friendly.  Cache
workings are machine-dependent, but only in the details: improving locality
of memory reference is a win.  For example, if a program maintains a tree as
one of its main data structures, and computation makes heavy use of lookup in
this tree, there could be cache gains from storing the root and its near
descendants all together in memory.  Another example: order the fields in
records so that fields accessed at the same time are stored adjacently.

Small case study: build a balanced tree of all alphabetic strings of length
three, then look them all up in reverse alphabetical order.  Tried reordering
key and subtree references -- but little difference.  Tried changing to a
trie representation, motivated by repeated access to different copies of the
same character.  Now most comparisons will fail at the first character: so
unpack the first character into the node itself -- a further 20% improvement,
but is it just algorithmic or is it to do with the cache?  Next step is to
unbox things: about another 40% improvement.  Again tried reordering fields
but no improvement -- thwarted by a compiler `optimisation' reordering fields
to put pointers first!  Work still `in progress'.  John wishes he had a cache
profiler.  Apparently there is one for GHC -- ask Julian S.
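The unpacking step might look like this in GHC Haskell (a sketch of my own,
not John's code; Trie, Node and member are invented names):

  -- A trie in left-child/right-sibling form, with the branching character
  -- unpacked into the node: the UNPACK pragma asks GHC to store the Char
  -- unboxed in the constructor, so a failing comparison at the first
  -- character needs no extra pointer dereference.
  data Trie
    = Leaf
    | Node {-# UNPACK #-} !Char   -- branching character, unboxed
           Trie                   -- continuations of keys through this char
           Trie                   -- alternative chars at this position
    deriving Show

  -- Membership test for fixed-length keys, as in John's case study.
  member :: String -> Trie -> Bool
  member _      Leaf = False
  member []     _    = False
  member (c:cs) (Node c' kids sibs)
    | c == c'   = null cs || member cs kids
    | otherwise = member (c:cs) sibs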
FEEDBACK AND DISCUSSIONS ON TRACING

* Ralf Hinze had an immediate interest in tracing a fault in his
  implementation of priority search queues for the bin-packing problem.  We
  made a start on this at the meeting, mainly using hat-detect, and were able
  to narrow the problem down to deletion in a red-black pennant tree, with a
  specific counter-example.  (Since returning home Ralf has mailed to say the
  bug is now found and fixed.)  The main problems we encountered (in order of
  importance) were:

  1 lack of pretty-printing: looking at (L (a) (r) (g) (e)) constructor
    terms -- most wrapping over several screen lines -- gets quite tiring
    after an hour or so!
  2 numbers other than Ints not properly presented
  3 partial applications can prevent accurate tracing (as can lambda
    expressions)
  4 other problems that have since been fixed

  Despite these problems Ralf thought Hat already looked useful.  He is keen
  to have a version as soon as possible and would be a good candidate for a
  beta-test before the next release.

* John O'Donnell has a long-standing interest in the tracing problem.  He
  endorsed the idea of supporting several different views, and commented that
  many of his `bugs' turn out to be shallow errors such as misspelling a
  function identifier in the final equation of a definition.  John would be
  keen to visit York to participate in the next Tracing Fest, so we should
  arrange it with his availability in mind.

* John Hughes has a long-standing fault in a ~5000-line Haskell program --
  his type specialiser -- and would very much like to fix it.  If we could
  find and fix John's problem it would be a nice `trophy' for Hat, so I will
  be asking him for more details.  If it looks like a problem that Hat could
  reach, it may be worth another trip to Sweden!

* Koen Claessen is still working on (Veri-)QuickCheck, which is another tool
  for finding faults.  Koen would be another good beta-tester: we might even
  persuade him to come to York for a Tracing Fest.

* Lennart Augustsson thought he might want traces of more than 2Gb!  He is
  not worried by the size of traces, but would be put off if trace generation
  remains so slow -- a factor of 5 is no problem, but a factor of 50 is too
  painful.

* Simon Peyton Jones wanted to know if one could mark points in the source,
  as with Hood.  (Simon has always seemed to favour source annotation as a
  hook for tools.)  I indicated that all the information was there for Hat to
  use links to/from the source, though at present that was not the way
  hat-observe was driven.  Of course Simon wants to know when Hat will work
  with GHC!  He was also concerned about the performance penalty.

* Thomas Johnsson thought it should be possible to make further use of
  lower-level primitives to gain speed -- without depending on details
  specific to compilers.  (No, he didn't offer any details!)

* Karl-Filip Faxen was interested in the tracer -- we travelled back to
  Stockholm together and did some more examples on the train.  He would like
  to visit us later in the year and would be pleased to take part in a
  Tracing Fest.  (He also has a high-performance whole-program compiler for a
  Haskell-like language -- I must ask him whether he could use nhc as a
  front-end.)

* Erik Meijer seemed keen on the idea of one-run, many observations.
  However, as a salaried Microsoft employee in a product-oriented division,
  he cannot be seen to make use of Linux.  Even Cygwin is not quite
  respectable.  It has to be undiluted Windoze!