Decompression is going to be much faster than before, because output now goes
directly into either the output byte array or the overflow buffer. This means
much less indirection from swapping buffers and such. Once the output array is
filled, output starts going into the overflow buffer. Compressed packets can
therefore be read in whole parts, so there will be no throwing or handling of
exceptions such as `PipeStalledException`. As such this should be a very large
speed boost. I wondered whether I still have that busybox ZIP file. I do not,
but I could make up another. It would also be much more efficient.
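
A minimal sketch of the "fill the output array first, then spill" idea. The
class and method names (`OverflowSink`, `drainOverflow`) are made up for
illustration and are not the real decompressor code. It also assumes that a
`ByteArrayOutputStream` is an acceptable stand-in for the overflow buffer.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch: decompressed bytes go straight into the caller's
// array, and only the excess lands in an overflow buffer.
final class OverflowSink {
    private final byte[] out;   // caller supplied output array
    private final int start;    // first usable index in out
    private final int end;      // exclusive end of the usable region
    private int at;             // next free index in out
    private final ByteArrayOutputStream overflow =
        new ByteArrayOutputStream();

    OverflowSink(byte[] out, int off, int len) {
        this.out = out;
        this.start = off;
        this.at = off;
        this.end = off + len;
    }

    /** Writes one byte, spilling to the overflow once the array is full. */
    void write(int b) {
        if (at < end)
            out[at++] = (byte) b;
        else
            overflow.write(b);
    }

    /** Number of bytes placed directly into the output array. */
    int written() {
        return at - start;
    }

    /** Returns and clears the bytes which did not fit in the array. */
    byte[] drainOverflow() {
        byte[] rv = overflow.toByteArray();
        overflow.reset();
        return rv;
    }
}
```

With something like this, a read call would first hand out any drained
overflow from the previous call and only then decode more input, so no
exception is needed to signal a full buffer.
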
For some reason the dependencies for `sliding-window` do not include the
library needed for the datadeque.

Let me see. Do I read MSB then flip, or read LSB then flip? Well, at the byte
level the order does not matter. But since values are generally MSB, I read a
small chunk of bits from the input bytes and then just shift them into
place. LSB bit reads just reverse the bits.
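
To make the two read orders concrete, here is a small sketch that follows the
bit packing described in RFC 1951: bits are taken from the least significant
end of each byte, header fields and extra bits are assembled low bit first,
and Huffman codes are assembled high bit first. `BitReaderSketch` and its
`msb` flag are illustrative names. This is deliberately the slow
bit-at-a-time form and not the chunked read described above.

```java
// Hypothetical sketch of DEFLATE style bit reading.
final class BitReaderSketch {
    private final byte[] in;
    private int bitPos;   // absolute bit index into the input

    BitReaderSketch(byte[] in) {
        this.in = in;
    }

    /** Next bit, taken from the least significant end of each byte. */
    int nextBit() {
        int b = (in[bitPos >>> 3] >>> (bitPos & 7)) & 1;
        bitPos++;
        return b;
    }

    /**
     * Reads count bits. If msb is true the first bit read becomes the
     * highest bit of the result (Huffman codes), otherwise it becomes
     * the lowest bit (header fields and extra bits).
     */
    int read(int count, boolean msb) {
        int rv = 0;
        for (int i = 0; i < count; i++) {
            int bit = nextBit();
            if (msb)
                rv = (rv << 1) | bit;
            else
                rv |= bit << i;
        }
        return rv;
    }
}
```

For example, the two bytes `0xF2 0x05` hold a non-final block bit, the fixed
Huffman block type, and then the fixed code for `M`: `read(1, false)` gives 0,
`read(2, false)` gives 1, and `read(8, true)` gives `0b01111101`.
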
Actually, I think it would be much more optimized if, instead of being a
bunch of bytes, the mini window were really just two int values. One will be
above, the other is below. Then when reading values I can simplify the loop
and I do not have to have complex code.
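
One possible reading of this idea, as a sketch only: one int holds the queued
bits and a second int counts how many of them are valid. This may or may not
be the "above and below" layout meant here, and the names (`MiniWindowSketch`,
`window`, `size`) are assumptions. Reads are limited to 16 bits so the queue
always fits in a single int.

```java
// Hypothetical sketch: a bit queue made of two ints instead of a byte array.
final class MiniWindowSketch {
    private final byte[] in;
    private int next;     // index of the next unread input byte
    private int window;   // queued bits, oldest bit at position 0
    private int size;     // number of valid bits in window

    MiniWindowSketch(byte[] in) {
        this.in = in;
    }

    /** Reads count bits (1 to 16), lowest bit first. */
    int read(int count) {
        if (count < 1 || count > 16)
            throw new IllegalArgumentException("count out of range");

        // Top up the window one byte at a time until enough bits are queued
        while (size < count) {
            if (next >= in.length)
                throw new IllegalStateException("out of input");
            window |= (in[next++] & 0xFF) << size;
            size += 8;
        }

        // Take the low bits and shift the rest down
        int rv = window & ((1 << count) - 1);
        window >>>= count;
        size -= count;
        return rv;
    }
}
```

The loop has no per-bit work and no array bookkeeping beyond the input index,
which is the simplification hoped for here.
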
I wonder what a fast way to reverse the bits in a byte is.
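
Two common options, sketched under the assumption that Java SE's
`Integer.reverse` is available (it may not be on smaller class libraries, in
which case the lookup table can be filled by a plain loop instead). The class
and method names are illustrative.

```java
// Hypothetical sketch: reversing the bits of a byte or of a short bit field.
final class BitReverseSketch {
    private static final byte[] TABLE = new byte[256];

    static {
        // Precompute every byte once so later lookups are a single load
        for (int i = 0; i < 256; i++)
            TABLE[i] = (byte) reverseByte(i);
    }

    /** Reverses the low 8 bits of b. */
    static int reverseByte(int b) {
        return Integer.reverse(b & 0xFF) >>> 24;
    }

    /** Same result as reverseByte, using the lookup table. */
    static int reverseByteTable(int b) {
        return TABLE[b & 0xFF] & 0xFF;
    }

    /** Reverses the low count bits of v (count from 1 to 32). */
    static int reverseBits(int v, int count) {
        return Integer.reverse(v) >>> (32 - count);
    }
}
```

Here `reverseByte(0xCA)` is `0x53`, and `reverseBits(0b01111101, 8)` is
`0b10111110`.
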
It appears that my reading of the fixed Huffman table is not correct.

I need to read `M`, but I read `c`.

I want to read 77 for `M`. Huffman-wise that means 01xxxxxx with a
6 bit read. Since the base is 16 that means I want 61 or 0b111101. So the file

* 0b0 -- Not final block
* 0b01 -- Fixed Huffman
* 0b01111101 -- Fixed literal code for M

* 0b01 -- Fixed Huffman
* 0b10010011 -- Fixed literal code for c
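
The two literal codes in the lists above can be checked against the fixed
Huffman table from RFC 1951. The following is a sketch of that table as plain
arithmetic; `FixedHuffmanSketch` and its method names are made up for this
note.

```java
// Hypothetical sketch: the fixed Huffman literal/length codes of RFC 1951.
final class FixedHuffmanSketch {
    /** Code value for a literal/length symbol (0 to 287). */
    static int codeFor(int sym) {
        if (sym < 0 || sym > 287)
            throw new IllegalArgumentException("bad symbol");
        if (sym <= 143)
            return 0b00110000 + sym;            // 8 bits
        if (sym <= 255)
            return 0b110010000 + (sym - 144);   // 9 bits
        if (sym <= 279)
            return sym - 256;                   // 7 bits
        return 0b11000000 + (sym - 280);        // 8 bits
    }

    /** Number of bits in the code for a symbol. */
    static int bitsFor(int sym) {
        if (sym <= 143)
            return 8;
        if (sym <= 255)
            return 9;
        if (sym <= 279)
            return 7;
        return 8;
    }

    /** The code as a zero padded binary string. */
    static String bits(int sym) {
        String s = Integer.toBinaryString(codeFor(sym));
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < bitsFor(sym); i++)
            sb.append('0');
        return sb.append(s).toString();
    }
}
```

`bits('M')` gives `01111101` and `bits('c')` gives `10010011`, matching the
two lists. The leading `01` plus the 6 bit value 61 is the same thing as
adding the 8 bit base `0b00110000` to 77.
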
The first 3 bytes of the file:

So I see where it is going, the bytes from the input are being read
incorrectly. So I suppose I should swap the values I read.

I should work with known input and output first.

So this new test code is a class file, which means the first byte read must be

DEBUG -- Fixed Code: 201 c9 0b11001001

Close, but it is wrong.

So from my working code it is:

DEBUG -- Read: rv=1 (0b1) msb=true bits=1
DEBUG -- Read: rv=2 (0b10) msb=false bits=2
DEBUG -- Read: rv=2 (0b10) msb=false bits=5

This means that my code for reading is flipping bits for values when it should not

Ok so right now I have:

DEBUG -- Read: rv=1 (0b1) msb=false bits=1
DEBUG -- Read: rv=2 (0b10) msb=false bits=2
DEBUG -- Read: rv=15 (0b1111) msb=false bits=5

DEBUG -- Read: rv=1 (0b1) msb=true bits=1
DEBUG -- Read: rv=2 (0b10) msb=false bits=2
DEBUG -- Read: rv=2 (0b10) msb=false bits=5
DEBUG -- Read: rv=11 (0b1011) msb=false bits=5
DEBUG -- Read: rv=14 (0b1110) msb=false bits=4
DEBUG -- Read: rv=5 (0b101) msb=false bits=3

Which means I am reading some bits wrong.

Or that the input is wrong also.

DEBUG -- Read: rv=15 (0b1111) msb=false bits=5
DEBUG -- mw=00000052 ms=8
DEBUG -- Read: rv=18 (0b10010) msb=false bits=5
DEBUG -- mw=000002ea ms=11
DEBUG -- Read: rv=10 (0b1010) msb=false bits=4

Filling the bytes I get `01010100111110` and the top sequence is...
`00010010111110`. Putting them side by side:
`01010100111110`. Probably an error, will retype:
`01010100111110`. Some of those bits are shifted off. In normal order:
`01111100101010`. But what should be read first is:
`00010` for the 5 bit two value. So the values are read into the window in the

The bits are not reversed in the window however.

I have no idea at all what is wrong with this code and I have no idea why my
other code reads the right values.

So the working code has a completely different input source.

The input is `7d525d4fd360147e5` but the actual read input is `15cb3b`. So
where are these bytes coming from? And with my old code, trying to change the
test code changes nothing.

Actually I was debugging the input for the wrong code.

Ok, now I have the right sequences.

A double increment made the code lengths not decompress properly.

So the current problem is that the written data appears to be garbage.

DEBUG -- Length: 3, Distance: 5
DEBUG -- Write 0x00 (?)
DEBUG -- Write 0x1f (?)
DEBUG -- Write 0x0a (?)
DEBUG -- Length: 5, Distance: 12

DEBUG -- Length: 4, Distance: 8
DEBUG -- Write 00 (?)
DEBUG -- Write 1e (?)
DEBUG -- Write 07 (?)
DEBUG -- Write 00 (?)
DEBUG -- Write 20 ( )
DEBUG -- Write 0a (?)
DEBUG -- Write 00 (?)
DEBUG -- Write 04 (?)
DEBUG -- Write 00 (?)
DEBUG -- Write 21 (!)
DEBUG -- Length: 4, Distance: 13

So the dynamic distance or length is being read improperly. Likely it is the
length that is incorrectly read.

The off by one for lengths is likely due to the zero check.
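
For reference when hunting an off by one like this, here is a sketch of how
length symbols map to lengths in RFC 1951, plus the back-reference copy that
consumes the length and distance. It does not reproduce the zero check in
question. The class and method names are hypothetical, and the copy writes
into a flat array and not into the real sliding window.

```java
// Hypothetical sketch: DEFLATE length symbols and back-reference copying.
final class LengthSketch {
    // Base lengths and extra bit counts for symbols 257 to 285 (RFC 1951)
    private static final int[] BASE = {
        3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
        35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258};
    private static final int[] EXTRA = {
        0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
        3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0};

    /** Extra bits to read after the given length symbol. */
    static int extraBits(int sym) {
        if (sym < 257 || sym > 285)
            throw new IllegalArgumentException("not a length symbol");
        return EXTRA[sym - 257];
    }

    /** Final length: note that symbol 257 already means a length of 3. */
    static int length(int sym, int extraValue) {
        if (sym < 257 || sym > 285)
            throw new IllegalArgumentException("not a length symbol");
        return BASE[sym - 257] + extraValue;
    }

    /**
     * Copies length bytes from distance bytes back, one at a time so that
     * overlapping copies repeat correctly. Returns the new write position.
     */
    static int copy(byte[] out, int at, int length, int distance) {
        if (distance <= 0 || distance > at)
            throw new IllegalArgumentException("bad distance");
        for (int i = 0; i < length; i++, at++)
            out[at] = out[at - distance];
        return at;
    }
}
```

Symbol 256 is the end of block marker and never reaches `length`, so an
index computed from 256 where 257 is required would shift every length
by one.
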
So it looks like my file is missing some data:

000200220800230a000200240700250a000900260a000900270a000200280a000200290
9002a002b0a0014002c0a0014002d0a002e002f0800300800310700320700330100063c
696e69743e010003282956010004436f646501000568656c6c6f01001428294c6a61766
12f6c616e672f537472696e673b0100046d61696e010016285b4c6a6176612f6c616e67
2f537472696e673b2956010005776f726c640c001600170100176a6176612f6c616e672
f537472696e674275696c6465720100136a6176612f6c616e672f436861726163746572
0c001600340c00350036010003656c6c0c003500370100116a6176612f6c616e672f496
e74656765720c001600380c0039003a0c0035003b0c003c001a07003d0c003e003f0c00
19001a0c001d001a0700400c0041004201000157010003726c6401000a48656c6c6f576
f726c640100106a6176612f6c616e672f4f626a65637401000428432956010006617070
656e6401002d284c6a6176612f6c616e672f4f626a6563743b294c6a6176612f6c616e6
72f537472696e674275696c6465723b01002d284c6a6176612f6c616e672f537472696e
673b294c6a6176612f6c616e672f537472696e674275696c6465723b010004284929560
10008696e7456616c756501000328294901001c2843294c6a6176612f6c616e672f5374
72696e674275696c6465723b010008746f537472696e670100106a6176612f6c616e672
f53797374656d0100036f75740100154c6a6176612f696f2f5072696e7453747265616d
3b0100136a6176612f696f2f5072696e7453747265616d0100077072696e746c6e01001
5284c6a6176612f6c616e672f537472696e673b29560021001400150000000000040001
00160017000100180000001100010001000000052ab70001b10000000000090019001a0
001001800000038000400000000002cbb000259b70003bb0004591048b70005b6000612
07b60008bb000959106fb7000ab6000b92b6000cb6000db0000000000089001b001c000
100180000002e0003000100000022b2000ebb000259b70003b8000fb600081020b6000c
b80010b60008b6000db60011b1000000000009001d001a0001001800000031000400000
0000025bb000259b700031212b60008bb000959106fb7000ab6000b92b6000c1213b600
08b6000db0000000000000

The `CAFEBABE` header on this class is missing.

But this is probably due to the read count not being read properly.
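
A quick sanity check for this kind of failure is to test the first four
output bytes against the class file magic number before comparing anything
else. A minimal sketch, with a made-up class name:

```java
// Hypothetical sketch: checking decompressed output for the class magic.
final class ClassMagicSketch {
    static final int MAGIC = 0xCAFEBABE;

    /** True if the data starts with the big-endian value 0xCAFEBABE. */
    static boolean hasClassMagic(byte[] data) {
        if (data == null || data.length < 4)
            return false;
        int v = ((data[0] & 0xFF) << 24)
            | ((data[1] & 0xFF) << 16)
            | ((data[2] & 0xFF) << 8)
            | (data[3] & 0xFF);
        return v == MAGIC;
    }
}
```

The broken dump above begins with `00 02 00 22`, so a check like this fails
immediately, while output starting with `ca fe ba be` passes.
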

cafebabe0000003400430a0015001e07001f0a0002001e0700200a000400210a0002002
20800230a000200240700250a000900260a000900270a000200280a0002002909002a00
2b0a0014002c0a0014002d0a002e002f0800300800310700320700330100063c696e697
43e010003282956010004436f646501000568656c6c6f01001428294c6a6176612f6c61
6e672f537472696e673b0100046d61696e010016285b4c6a6176612f6c616e672f53747
2696e673b2956010005776f726c640c001600170100176a6176612f6c616e672f537472
696e674275696c6465720100136a6176612f6c616e672f4368617261637465720c00160
0340c00350036010003656c6c0c003500370100116a6176612f6c616e672f496e746567
65720c001600380c0039003a0c0035003b0c003c001a07003d0c003e003f0c0019001a0
c001d001a0700400c0041004201000157010003726c6401000a48656c6c6f576f726c64
0100106a6176612f6c616e672f4f626a65637401000428432956010006617070656e640
1002d284c6a6176612f6c616e672f4f626a6563743b294c6a6176612f6c616e672f5374
72696e674275696c6465723b01002d284c6a6176612f6c616e672f537472696e673b294
c6a6176612f6c616e672f537472696e674275696c6465723b0100042849295601000869
6e7456616c756501000328294901001c2843294c6a6176612f6c616e672f537472696e6
74275696c6465723b010008746f537472696e670100106a6176612f6c616e672f537973
74656d0100036f75740100154c6a6176612f696f2f5072696e7453747265616d3b01001
36a6176612f696f2f5072696e7453747265616d0100077072696e746c6e010015284c6a
6176612f6c616e672f537472696e673b295600210014001500000000000400010016001
7000100180000001100010001000000052ab70001b10000000000090019001a00010018
00000038000400000000002cbb000259b70003bb0004591048b70005b600061207b6000
8bb000959106fb7000ab6000b92b6000cb6000db0000000000089001b001c0001001800
00002e0003000100000022b2000ebb000259b70003b8000fb600081020b6000cb80010b
60008b6000db60011b1000000000009001d001a00010018000000310004000000000025
bb000259b700031212b60008bb000959106fb7000ab6000b92b6000c1213b60008b6000

Now to check it against the output.

And it matches. Which means decompression works. Took many an hour, but it

I wonder how much faster the code is, since it is more direct now.

I can probably optimize the sliding window read so that it does not allocate
a bunch of temporary byte arrays.
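
One way that could look, as a sketch only: keep the window as a ring buffer
and copy straight from it into the destination array with at most two
`System.arraycopy` calls. `RingWindowSketch` and its methods are hypothetical
and are not the `sliding-window` project's API. A read longer than its
distance (a repeating copy) is assumed to be split up by the caller.

```java
// Hypothetical sketch: a sliding window whose reads allocate nothing.
final class RingWindowSketch {
    private final byte[] ring;
    private int head;    // index where the next byte will be stored
    private int total;   // bytes currently held, capped at ring.length

    RingWindowSketch(int size) {
        ring = new byte[size];
    }

    /** Appends one byte, overwriting the oldest once the ring is full. */
    void append(byte b) {
        ring[head] = b;
        head = (head + 1) % ring.length;
        if (total < ring.length)
            total++;
    }

    /** Copies len bytes starting ago bytes back directly into dest. */
    void get(int ago, byte[] dest, int off, int len) {
        if (ago <= 0 || ago > total || len < 0 || len > ago)
            throw new IndexOutOfBoundsException("bad window read");

        // Start of the requested run inside the ring
        int from = (head - ago + ring.length) % ring.length;

        // First part runs up to the end of the ring, the rest wraps around
        int first = Math.min(len, ring.length - from);
        System.arraycopy(ring, from, dest, off, first);
        System.arraycopy(ring, 0, dest, off + first, len - first);
    }
}
```

The caller supplies the destination, so the only array ever allocated is the
ring itself.
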
Decompressing busybox, at least for PowerPC (~772KiB), works for about 15
seconds before throwing an exception. But, this is better:

DEBUG -- nano=13258083193, msec=13258, read=447488

This means the read speed is perhaps ~33KiB/s. I would have to search
through the blogs to see what my old speed was. The note is on 2016/08/06.
But I need a binary that is also 1899912 bytes in size. The PowerPC binary
is much smaller, but it probably does not make much of a difference.

So if my old speed was ~22KiB/s, this is a good improvement.
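
The arithmetic behind those figures, as a small sketch. The class name is
made up, and the 22 KiB/s baseline is only the remembered old speed from
above, not a fresh measurement.

```java
// Hypothetical sketch: turning the DEBUG timing line into KiB per second.
final class ThroughputSketch {
    /** Throughput in KiB/s for a byte count and an elapsed time in ns. */
    static double kibPerSecond(long bytes, long nanos) {
        return (bytes / 1024.0) / (nanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        // Values taken from the DEBUG line above
        double now = kibPerSecond(447488L, 13258083193L);
        System.out.printf("%.2f KiB/s%n", now);

        // Ratio against the assumed old speed of about 22 KiB/s
        System.out.printf("%.2fx%n", now / 22.0);
    }
}
```

That is 437 KiB in about 13.26 seconds, so roughly 33 KiB/s, or about one and
a half times the old speed if the 22 KiB/s figure is right.
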