                HAT file format
                ART Memo 14 (Version 3)
                Malcolm Wallace, 11 April 2001

The Hat file is a sequence of nodes, each of which is a sequence of
bytes.  There are several types of node, of varying size.  Every node
is introduced by a tag byte, as described in this diagram:

Tag byte in 'Hat v02' trace archive files
=========================================

   MSB                                                     LSB
    7       6       5       4       3       2       1       0     bit number
-----------------------------------------------------------------
|   0   |   node kind   |   0   |       other information       |
-----------------------------------------------------------------

-----------------------------------------------------------------
|   0   |  00 = Trace   |   0   |   0   |   0   | 00 = Ap       |
|       |               |       |       |       | 01 = Symbol   |
|       |               |       |       |       | 10 = Proj     |
|       |               |       |       |       | 11 = Hidden   |
|       |               |       |       | 1=Sat | 00 = Applied  |
|       |               |       |       |       | 01 = Blackhole|
|       |               |       |       |       | 10 = Concluded|
|       |               |       |       |       | 11 = _        |
-----------------------------------------------------------------

-----------------------------------------------------------------
|   0   |  01 = Module  |   0   |   0   |   0   |   0   |0=susp |
|       |               |       |       |       |       |1=trust|
-----------------------------------------------------------------

-----------------------------------------------------------------
|   0   |  10 = Atom    |   0   | 0000 = Int literal            |
|       |               |       | 0001 = Char literal           |
|       |               |       | 0010 = Integer literal        |
|       |               |       | 0011 = Rational literal       |
|       |               |       | 0100 = Float literal          |
|       |               |       | 0101 = Double literal         |
|       |               |       | 0110 = Identifier toplevel    |
|       |               |       | 0111 = Identifier local       |
|       |               |       | 1000 = CString                |
|       |               |       | 1001 = Abstract               |
|       |               |       | 1010 = Case                   |
|       |               |       | 1011 = Lambda                 |
|       |               |       | 1100 = If                     |
|       |               |       | 1101 = Guard                  |
|       |               |       | 1110 = NamedField             |
-----------------------------------------------------------------

-----------------------------------------------------------------
|   0   |  11 = SrcRef  |   0   |   0   |   0   |   0   |   0   |
-----------------------------------------------------------------

Following each tag, there is a sequence of bytes.  The following table
defines how many bytes, and how they are to be interpreted.
We define a `fileoffset' to be an unsigned 4-byte integer.  All values
are stored in big-endian order in the file, regardless of the native
machine ordering.  All strings (except in the header) are represented
as a one-byte length, the literal character bytes, and no \0 terminator.

File header
===========

Beginning of file        8 literal character bytes ("Hat v02\0")
-----------------      + fileoffset (error entry point for trace)
                       + fileoffset (error explanation -> Atom CString)

Bytes to follow tag byte (v02)
==============================

Trace
-----
0000 0000  Ap            1 byte (arity)
                       + fileoffset (trace of Ap)
                       + fileoffset (srcref)
                       + (arity+1) * fileoffset (traces of fn and args)
0000 0001  Symbol        fileoffset (trace of parent of Symbol node)
                       + fileoffset (Atom)
                       + fileoffset (srcref)
0000 0010  Projection    fileoffset (trace of projective parent)
                       + fileoffset (srcref of projective parent)
                       + fileoffset (trace of value)
0000 0011  Hidden        fileoffset (trace)
0000 0100  Sat Applied   fileoffset (trace)
0000 0101  Sat Blckhl    fileoffset (trace)
0000 0110  Sat Conc      fileoffset (trace)
0000 0111  _

Module
------
0010 000x                1 byte (length n in bytes)
                       + n * bytes (module name)
                       + 1 byte (length n)
                       + n * bytes (src filename)

Atom
----
0100 0000  Int           4 bytes (Int)
0100 0001  Char          1 byte (Char)
0100 0010  Integer       1 byte (length in words)
                       + length * 4 bytes (Integer value)
0100 0011  Rational      1 byte (length numerator)
                       + length numerator * 4 bytes (actual numerator)
                       + 1 byte (length denominator)
                       + length denominator * 4 bytes (actual denominator)
0100 0100  Float         4 bytes (Float)
0100 0101  Double        8 bytes (Double)
0100 0110  Identifier    fileoffset (modinfo)    -- top-level identifiers
                       + 1 byte (infix priority)
                       + 4 bytes (defn posn)
                       + 1 byte (length n)
                       + n * bytes (name)
0100 0111  LocalIdent    fileoffset (modinfo)    -- local identifiers
                       + 1 byte (infix priority)
                       + 4 bytes (defn posn)
                       + 1 byte (length n)
                       + n * bytes (name)
0100 1000  CString       1 byte (length n)
                       + n * bytes (string)
0100 1001  Abstract      1 byte (length n)
                       + n * bytes (description of value)
0100 1010  Case          nothing
0100 1011  Lambda        nothing
0100 1100  If            nothing
0100 1101  Guard         nothing
0100 1110  NamedField    nothing

SrcRef
------
0110 0000                fileoffset (module info)
                       + 4 bytes (posn of use)
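
Illustrative reader sketch
--------------------------
The layout above can be exercised with a small reader.  The following
Python sketch is not part of the Hat tools; it is a minimal illustration
written against the tables in this memo.  All function and constant
names (parse_header, decode_tag, read_string, read_module, read_ap) are
hypothetical.  The memo does not state a character encoding for strings,
so the sketch assumes Latin-1, which maps every byte to one character.

```python
import struct

HEADER_MAGIC = b"Hat v02\x00"          # 8 literal bytes, from the memo
NODE_KINDS = ("Trace", "Module", "Atom", "SrcRef")
TRACE_KINDS = ("Ap", "Symbol", "Projection", "Hidden",
               "Sat Applied", "Sat Blackhole", "Sat Concluded", "_")
ATOM_KINDS = ("Int", "Char", "Integer", "Rational", "Float", "Double",
              "Identifier", "LocalIdent", "CString", "Abstract",
              "Case", "Lambda", "If", "Guard", "NamedField")


def parse_header(data):
    """Return the two header fileoffsets (error trace, error explanation)."""
    if data[:8] != HEADER_MAGIC:
        raise ValueError("not a Hat v02 file")
    return struct.unpack(">II", data[8:16])   # big-endian, unsigned 4-byte


def decode_tag(tag):
    """Split a tag byte into (node kind, sub-kind label)."""
    if tag & 0x90:                            # bits 7 and 4 are always 0
        raise ValueError(f"reserved bits set in tag {tag:#04x}")
    kind = NODE_KINDS[(tag >> 5) & 0x3]       # bits 6-5 select the node kind
    info = tag & 0x0F                         # bits 3-0: other information
    if kind == "Trace" and info < len(TRACE_KINDS):
        return kind, TRACE_KINDS[info]
    if kind == "Module" and info < 2:
        return kind, "trust" if info else "susp"
    if kind == "Atom" and info < len(ATOM_KINDS):
        return kind, ATOM_KINDS[info]
    if kind == "SrcRef" and info == 0:
        return kind, None
    raise ValueError(f"undefined tag {tag:#04x}")


def read_string(data, pos):
    """Read a one-byte length plus that many bytes; return (text, next pos)."""
    n = data[pos]
    # Latin-1 is an assumption of this sketch, not something the memo states.
    return data[pos + 1:pos + 1 + n].decode("latin-1"), pos + 1 + n


def read_module(data, pos):
    """Read a Module node whose tag byte is at pos."""
    name, nxt = read_string(data, pos + 1)
    filename, nxt = read_string(data, nxt)
    return name, filename, nxt


def read_ap(data, pos):
    """Read an Ap node whose tag byte is at pos."""
    arity = data[pos + 1]
    count = 2 + (arity + 1)                   # trace, srcref, fn, then args
    fields = struct.unpack_from(f">{count}I", data, pos + 2)
    node = {"arity": arity, "trace": fields[0], "srcref": fields[1],
            "fn": fields[2], "args": list(fields[3:])}
    return node, pos + 2 + 4 * count
```

For example, decode_tag(0b01001000) yields ("Atom", "CString"), and an Ap
node of arity 2 occupies 1 + 1 + 5*4 = 22 bytes including its tag.  A
full reader would dispatch on the result of decode_tag and follow the
fileoffsets it finds; that is left out here to keep the sketch short.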