====================== REDUCERON MEMO 4 Chunky lists Matthew N, 17 November ====================== In Memo 1, I considered the possibility of a "chunky list", that is a list that can be constructed from (say) four heads as well as just from one. data List a = Nil | Cons a (List a) | Cons4 a a a a (List a) Let's take a few standard functions over lists: a consumer (sum), a processor (map), and a producer (replicate). sumL :: List Int -> Int sumL Nil = 0 sumL (Cons x xs) = x + sumL xs mapL f Nil = Nil mapL f (Cons x xs) = Cons (f x) (mapL f xs) replicateL :: Int -> a -> List a replicateL n a = if n <= 0 then Nil else Cons a (replicateL (n-1) a) Currently these functions ignore the Cons4 constructor. The intuition behind Cons4 is that is satisfies the following law. Cons4 x0 x1 x2 x3 xs = Cons x0 (Cons x1 (Cons x2 (Cons x3 xs))) (1) Now we want to synthesise a right-hand-side for sumL (Cons4 x0 x1 x2 x3 xs) which by compile-time evaluation proceeds as follows sumL (Cons4 x0 x1 x2 x3 xs) = sumL (Cons x0 (Cons x1 (Cons x2 (Cons x3 xs)))) = x0 + sumL (Cons x1 (Cons x2 (Cons x3 xs))) ... = x0 + x1 + x2 + sumL xs We can take a similar attack on map. mapL f (Cons4 x0 x1 x2 x3 xs) = mapL f (Cons x0 (Cons x1 (Cons x2 (Cons x3 xs)))) = Cons (f x0) (mapL f (Cons x1 (Cons x2 (Cons x3 xs)))) ... = Cons (f x0) (Cons (f x1) (Cons (f x2) (Cons (f x3) (mapL f xs)))) By equation (1) we get Cons4 (f x0) (f x1) (f x2) (f x3) (mapL f xs) In replicate, we can inline the recursive call three times. if n <= 0 then Nil else Cons a (if n-1 <= 0 then Nil else Cons a (if n-2 <= 0 then Nil else Cons a (if n-3 <= 0 then Nil else Cons a (replicateL (n-4) a)))) If the condition cannot evaluate to _|_ then a condtional can be rewritten using the distribution law f (if cond then e0 else e1) = if cond then f e0 else f e1 Since n is evaluated in the outermost condition, the distribution law can be applied in both its branches. if n <= 0 then Nil else if n <= 1 then Cons a Nil else if n <= 2 then Cons a (Cons a Nil) else if n <= 3 then Cons a (Cons a (Cons a Nil)) else Cons a (Cons a (Cons a (Cons a (replicateL (n-4) a)))) By equation (1) we get if n <= 0 then Nil else if n <= 1 then Cons a Nil else if n <= 2 then Cons a (Cons a Nil) else if n <= 3 then Cons a (Cons a (Cons a Nil)) else Cons4 a a a a (replicateL (n-4) a) Experimental results -------------------- The number of clock-cycles taken to compute sumL (mapL sumL (replicateL 100 (replicateL 100 0))) using the original definitions is 269043 and using the wide definitions is 170992. That's a 36% improvement. Conclusion ---------- Without sacrificing laziness we can transform programs to use chunky lists, giving a nice performance improvement (at least in one artificial example, which could probably be better improved using fusion!) The extent to which the transformation can be reliably done automatically remains unknown. Discussion (added 18 November) ------------------------------ The chunky benchmark would be even more efficient if you transformed replicateL to test for the chunky case *first*, and in other cases to use a binary chop: if n > 3 then Cons4 ... else if n > 1 then if n > 2 then ... else ... else if n > 0 then ... else ... As you say the extent to which listy functions can be automatically extended to chunky functions is uncertain. So there is the likely problem of switching to and fro between representations. The overhead of expanding Cons4 for the benefit of non-chunky functions may not be too bad. But there is the much more awkward issue of whether it is possible to chunkify long lists those functions may produce in standard Cons-Nil form.