The compressor achieves an average compression rate of 60% of the
original size, which is on par with "gzip". It seems that you cannot
do much better for compressing compiled binaries. This means that the
break-even point for using compressed images is reached once the
uncompressed size approaches 1.5kB. We can stuff more than 12kB into
an 8kB EPROM and more than 25kB into a 16kB EPROM. As there is only
32kB of RAM for both the uncompressed image and its BSS area, 32kB
EPROMs will hardly ever be required.
The compression algorithm uses a 4kB ring buffer for buffering the
uncompressed data. Before compression starts, the ring buffer is
filled with spaces (ASCII character 0x20). The algorithm tries to
find repeated input sequences with a maximum length of 60 bytes. All
256 different input bytes plus the 58 (60 minus a threshold of 2)
possible repeat lengths form a set of 314 symbols. These symbols are
adaptively Huffman encoded. The algorithm starts out with a Huffman
tree that assigns equal code lengths to each of the 314 symbols
(slightly favoring the repeat symbols over the symbols for regular
input characters), and the tree is updated whenever the frequency of
any of the symbols changes. Frequency counts are kept in 16-bit words
until the total number of compressed codes reaches 2^15. Then, all
frequency counts are halved (rounding to the bigger number). For
unrepeated characters (symbols 0..255), the Huffman code is written
to the output stream. For repeated characters, the Huffman code that
denotes the length of the repeated sequence is written out, and then
the index into the ring buffer is computed. From this index, the
algorithm computes the offset relative to the current index into the
ring buffer. For typical input data, one would expect short to
medium range offsets to be more frequent than extremely short or
medium to long range offsets. Thus, the 12-bit (for a 4kB buffer)
offset value is statically Huffman encoded using a precomputed
Huffman tree that favors those offset values that are deemed to be
more frequent. The Huffman encoded offset is written to the output
data stream, directly following the code that determines the length
of the repeated sequence.
This algorithm, as implemented in the C example code, looks very good
and its operating parameters are already well optimized. This also
explains why it achieves compression ratios comparable with
"gzip". Depending on the input data, it sometimes excels considerably
beyond what "gzip -9" does, but this phenomenon does not appear to be
typical. There are some flaws in the algorithm, such as the limited
buffer sizes, the adaptive Huffman tree, which takes very long to
change if the input characters experience a sudden change in
distribution, and the static Huffman tree for encoding offsets into
the buffer. The slow adaptation of the Huffman tree is partially
counteracted by artificially limiting the frequency counts to 16-bit
precision, but this does not come into play until 2^15 codes have
been emitted, i.e. until at least 32kB of data has been processed, so
it does not have any impact on our use for "etherboot", because the
BOOT Prom does not support uncompressed data of more than 32kB
(cf. doc/spec.doc).
Nonetheless, these problems do not seem to affect compression of
compiled programs very much. Mixing object code with English text
would not work too well, though, and the algorithm should be reset in
between. Actually, we might gain a little improvement if text and
data segments were compressed individually, but I have not
experimented with this option yet.