====================== REDUCERON MEMO 23 York Lava, by example Matthew N, 22 May 2009 ====================== York Lava is Haskell library for describing digital circuits. Descriptions can be simulated in Hugs or GHC, and converted to VHDL compatible with XST, the Xilinx Synthesis Tool. It is largely compatible with Chalmers Lava, but also offers a few new bits and bobs. Simulation ---------- York Lava provides an abstract data type 'Bit' and a set of primitive operators over values of this type. For example, the operator for a two-input AND gate is written '<&>' and has type Bit -> Bit -> Bit Expressions involving values of type 'Bit' can be simulated at the Lava prompt. Lava> low <&> high low Lava> map inv [low, high, low] [high,low,high] By default, only the output of the circuit on the 1st clock-cycle is displayed. More generally, 'simulateN' gives the outputs on the first N cycles. Lava> simulateN 3 $ map inv [low, high, low] [[high,low,high],[high,low,high],[high,low,high]] Lava> simulateN 4 $ delay low high [low,high,high,high] The 'delay' operator represents a D-type flip-flop; it outputs its first argument on the first clock-cycle, and thereafter outputs its second argument delayed by one clock-cycle. Alternatively, 'simulate' gives the outputs on all cycles. Lava> simulate $ let x = delay low (inv x) in x [low,high,low,high,low,high,... Note the feedback loop in this example: the flip-flop's output is connected to its own input via an inverter. And 'simulateSeq' allows a constant waveform to be specified as input. Lava> simulateSeq (delay low) [low, high, low, high, low] [low,low,high,low,high] A conversion function from bits to booleans ('bitToBool') is provided; this is useful for defining custom simulation routines, and also for testing circuit properties using tools such as QuickCheck. Lava> takeWhile bitToBool $ simulate $ delay high (delay high low) [high,high] Synthesis --------- Further to simulation, Haskell functions over bits can also be synthesised. For example, the function 'halfAdd' halfAdd :: Bit -> Bit -> (Bit, Bit) halfAdd a b = (sum, carry) where sum = a <#> b carry = a <&> b can be converted to a VHDL entity with inputs named "a" and "b" and outputs named "sum" and "carry". synthesiseHalfAdd :: IO () synthesiseHalfAdd = writeVhdl "HalfAdd" (halfAdd (name "a") (name "b")) (name "sum", name "carry") The VHDL is written to the directory specified by the first argument of 'writeVhdl'. (Sometimes more than one source file is generated, for example when block-RAMs are used.) Lava> synthesiseHalfAdd Creating directory 'HalfAdd/' Writing to 'HalfAdd/HalfAdd.vhd' Done. Sized vectors ------------- Values of type 'Vec n a' are vectors containing 'n' elements of type 'a'. Bit-vectors are a commonly-used special case. type Word n = Vec n Bit Valid numeric types are 'Z', 'S Z', 'S (S Z)' and so on; the type constructors 'N0' to 'N127' are pre-defined as type synonyms. The 'Word' type is a member if Haskell's 'Num' class. Lava> 13 :: Word N4 Vec [high,low,high,high] Lava> 4+3 :: Word N8 Vec [high,high,high,low,low,low,low,low] Note the least-significant bit comes first. Each numeric type contains exactly one value. For example, the type 'S Z' contains the value 'S Z'. All type-level numbers are members of the type class 'N'. class N n where value :: n -> Int The values 'n0' to 'n127', with types 'N0' to 'N127' respectively, are predefined as constants. Lava> value n4 4 The 'value' function is non-strict; its result is computed entirely from the *type* of its argument. The function 'vat' indexes a vector. Lava> (13 :: Word N4) `vat` n2 low A static type error is given if the index is out of bounds. Vectors can be turned into lists, for more convenient processing. Lava> velems (13 :: Word N4) [high,low,high,high] Many other operations on vectors are provided - see the module 'Lava.Vector'. Generic circuits ---------------- Many of the functions exported by Lava are generic in the sense that they operate over arbitrary collections of bits, not simply single bits. For example, the function '===' has type Generic a => a -> a -> Bit and can be used to compare any two structures of bits. Lava> low === high low Lava> (low, high, low) === (low, high, low) high Examples of other generic circuits include: Lava> orG (low, (low, low), high) high Lava> low ? ([low, high], [high, low]) [high, low] Lava> pickG [(low, 0), (low, 1), (high, 2), (low, 3)] :: Word N4 Vec [low,high,low,low] Lava> simulateN 4 $ delay (low, high) (high, high) [(low,high),(high,high),(high,high),(high,high)] See the 'Lava.Prelude' module for more generic circuits. It is straightforward to make any algebraic data type a member of the 'Generic' class using the helper functions 'cons' and '><'. For example, here is the list instance: instance Generic a => Generic [a] where generic [] = cons [] generic (a:as) = cons (:) >< a >< as Memory ------ York Lava supports primitive components which produce multiple output bits. Random-access memories (RAMs), which have data-output buses, are a prime example. ram :: [Integer] -> RamAlgorithm -> RamInputs n m -> Word n The 1st argument states what the locations beginning at address 0 should be initialised to. The list of initial values may have a length smaller than the number of locations in the RAM; any locations whose initial values are unspecified are initialised to 0. The 2nd argument specifies to the Xilinx Core Generator which cascading algorithm should be used when making big RAMs out of individual block-RAMs. data RamAlgorithm = MinArea | Width1 | Width2 | Width4 | Width9 | Width18 | Width36 Choosing 'MinArea' causes the "Minimum Area" algorithm to be used; any other choice results in the "Fixed Primitives" algorithm being used. See the Xilinx Core Generator documentation for details on these algorithms. The 3rd argument species the data-input and address buses, along with a write-enable signal. The type parameters 'n' and 'm' are the widths of the data and address buses respectively. data RamInputs n m = RamInputs { ramData :: Word n , ramAddress :: Word m , ramWrite :: Bit } True dual-port RAMs are also provided through the 'dualRam' primitive, the main difference being that they take a pair of 'RamInputs' and produce a pair of data-output buses. Memory collisions are resolved using the Xilinx "WRITE_FIRST" semantics. Behavioural description ----------------------- York Lava provides the following constructs for writing behavioural descriptions. Skip :: Recipe Tick :: Recipe (<==) :: Var n -> Word n -> Recipe Seq :: [Recipe] -> Recipe Par :: [Recipe] -> Recipe (|>) :: Bit -> Recipe -> Recipe While :: Bit -> Recipe -> Recipe (For a precise semantics of this language, see Memo 24.) There are two steps to writing behavioural descriptions in York Lava. The first is to define a state type (say 'State') that contains all the state variables to be used in the description. And the second is to define a function of type 'State -> Recipe' which manipulates the state in desired way. To illustrate, consider the implementation of a sequential multiplier using the shift-and-add algorithm. In this case, the state type contains three variables: the two inputs to multiply, and the result of the multiplication. data Mult n = Mult { a, b, result :: Var n } A value of type 'Mult n' is created by 'newMult'. newMult :: N n => New (Mult n) newMult = return Mult `ap` newReg `ap` newReg `ap` newReg The shift-and-add recipe takes a value of type 'Mult n'. shiftAndAdd s = While (s!b!val =/= 0) $ Seq [ s!a <== s!a!val!shr , s!b <== s!b!val!shl , s!b!val!vhead |> s!result <== s!result!val + s!a!val , Tick ] shr x = low +> vinit x shl x = vtail x <+ low Three remarks are in order: 1. The '!' operator is flipped application with a high precedence. infixl 9 ! (!) :: a -> (a -> b) -> b x!f = f x This gives descriptions an appropriate object-oriented flavour. 2. The value of a variable is obtained using the function: val :: Var n -> Word n 3. The functions '+>' and '<+' perform cons and snoc operations on vectors, 'vhead' takes the head of a vector, and '=/=' is generic disequality. To actually perform a multiplication, the input variables need to be initialised. multiply x y s = Seq [ s!a <== x, s!b <== y, Tick, s!shiftAndAdd ] For example, an eight-bit multiplier that computes 5*5. example :: Mult N8 -> Recipe example s = s!multiply 5 5 The example can simulated using 'simRecipe', which a value of type 'New s' and passes the state (of type 's') to a function of type 's -> Recipe'. It performs the recipe, and then applies a projection function to the final state, to select the desired outputs. simExample = simRecipe newMult example result Lava> simExample Vec [high,low,low,high,high,low,low,low] The example can be synthesised by calling 'recipe' which is similar to 'simRecipe' but takes a single-cycle pulse indicating when to start, and produces a single-cycle pulse indicating when the recipe has finished. synExample = do let (s, done) = recipe newMult example (delay high low) writeVhdl "MultExample" (s!result!val, done) (nameWord "result", name "done") Poly: a simple stack processor ------------------------------ To complete this overview, a simple processor is defined. The aim of the processor is to evaluate expressions of the form data Expr = X | N Integer | Expr :+: Expr | Expr :*: Expr For example: expr = (X :+: X) :+: (N 2 :*: X) :+: X The semantics of expressions is as follows. eval :: Expr -> Integer -> Integer eval X n = n eval (N i) n = i eval (e1 :+: e2) n = eval e1 n + eval e2 n eval (e1 :*: e2) n = eval e1 n * eval e2 n For example: Lava> eval expr 3 15 A stack machine to evaluate expressions might have the following instructions. data INSTR = LIT Integer | DUP | REV | ADD | MUL | HALT Expressions can be compiled to a sequence of such instructions. compile :: Expr -> [INSTR] compile e = comp e ++ [HALT] comp X = [] comp (N n) = [LIT n] comp (e1 :+: e2) = DUP:comp e2 ++ [REV] ++ comp e1 ++ [ADD] comp (e1 :*: e2) = DUP:comp e2 ++ [REV] ++ comp e1 ++ [MUL] For example: Lava> compile expr [DUP,REV,DUP,DUP,REV,LIT 2,MUL,REV,DUP,REV,ADD,ADD,ADD,HALT] The semantics of the stack machine is as follows. exec :: [INSTR] -> [Integer] -> Integer exec (LIT m : c) (n:s) = exec c (m:s) exec (DUP : c) (m:s) = exec c (m:m:s) exec (REV : c) (m:n:s) = exec c (n:m:s) exec (ADD : c) (m:n:s) = exec c (m+n:s) exec (MUL : c) (m:n:s) = exec c (m*n:s) exec (HALT : c) (n:s) = n To implement the machine at the digital-circuit level, a bit-level encoding of the instruction set is required. encode :: INSTR -> Integer encode (LIT n) = 1 + 2*fromIntegral n encode DUP = 2 encode REV = 4 encode ADD = 8 encode MUL = 16 encode HALT = 32 type DataWidth = N8 type Instr = Word DataWidth type Value = Word DataWidth isLIT, isDUP, isREV, isADD, isMUL, isHALT :: Instr -> Bit isLIT i = i `vat` n0 isDUP i = inv (isLIT i) <&> (i `vat` n1) isREV i = inv (isLIT i) <&> (i `vat` n2) isADD i = inv (isLIT i) <&> (i `vat` n3) isMUL i = inv (isLIT i) <&> (i `vat` n4) isHALT i = inv (isLIT i) <&> (i `vat` n5) getLIT :: Instr -> Value getLIT i = vtail i <+ low Note the bit-width of the machine is parameterised by 'DataWidth'. The machine state consists of an instruction stack, a value stack, a multiplier, and a register storing the top element of the value stack. data Poly = Poly { code :: Stack DataWidth AddrWidth , stack :: Stack DataWidth AddrWidth , mult :: Mult DataWidth , rtop :: Var DataWidth } type AddrWidth = N10 The implementation of the 'Stack' data type is not be presented here (but can be found in the examples directory); it is sufficient to know that it supports the following operations. top :: Stack n m -> Word n pop :: Stack n m -> Recipe push :: Word n -> Stack n m -> Recipe The function 'newPoly' takes the program (used to initialise the instruction stack) and an initial value for the top of the value stack. newPoly :: [Integer] -> Integer -> New Poly newPoly code x = return Poly `ap` newStack code `ap` newStack [] `ap` newMult `ap` newRegInit (fromIntegral x) The behaviour of the processor is as follows. poly :: Poly -> Recipe poly s = let instr = s!code!top in Seq [ Tick , While (instr!isHALT!inv) $ Seq [ isLIT instr |> s!rtop <== getLIT instr , isDUP instr |> s!stack!push (s!rtop!val) , isREV instr |> Seq [ s!rtop <== s!stack!top , s!stack!pop , s!stack!push (s!rtop!val) ] , isADD instr |> Seq [ s!rtop <== s!rtop!val + s!stack!top , s!stack!pop ] , isMUL instr |> Seq [ s!mult!multiply (s!rtop!val) (s!stack!top) , s!rtop <== s!mult!result!val , s!stack!pop ] , s!code!pop , Tick ] ] Notice that each instruction executes in one clock-cycle, apart from multiplication which may take up to 'DataWidth' clock-cycles. To test the processor, a custom simulation routine is useful. simPoly :: Expr -> Integer -> Value simPoly e x = simRecipe (newPoly code x) poly (val . rtop) where code = reverse $ map enc $ compile e Lava> simPoly expr 3 Vec [high,high,high,high,low,low,low,low] The function 'simPoly' has almost the same type has 'eval', and it is straightforward to define a correctness property for the processor. prop_poly :: Expr -> Integer -> Bool prop_poly e x = eval e x == wordToInt (simPoly e x)