/* Inflate deflated data

   Copyright (C) 1997-1999, 2002, 2006, 2009 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3, or (at your option)
   any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software Foundation,
   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */

/* Not copyrighted 1992 by Mark Adler
   version c10p1, 10 January 1993 */

/* You can do whatever you like with this source file, though I would
   prefer that if you modify it and redistribute it that you include
   comments to that effect with your name and the date.  Thank you.
   [The history has been moved to the file ChangeLog.]
 */

/*
   Inflate deflated (PKZIP's method 8 compressed) data.  The compression
   method searches for as much of the current string of bytes (up to a
   length of 258) in the previous 32K bytes.  If it doesn't find any
   matches (of at least length 3), it codes the next byte.  Otherwise, it
   codes the length of the matched string and its distance backwards from
   the current position.  There is a single Huffman code that codes both
   single bytes (called "literals") and match lengths.  A second Huffman
   code codes the distance information, which follows a length code.  Each
   length or distance code actually represents a base value and a number
   of "extra" (sometimes zero) bits to read and add to the base value.  At
   the end of each deflated block is a special end-of-block (EOB) literal/
   length code.  The decoding process is basically: get a literal/length
   code; if EOB then done; if a literal, emit the decoded byte; if a
   length then get the distance and emit the referred-to bytes from the
   sliding window of previously emitted data.

   There are (currently) three kinds of inflate blocks: stored, fixed, and
   dynamic.  The compressor deals with some chunk of data at a time, and
   decides which method to use on a chunk-by-chunk basis.  A chunk might
   typically be 32K or 64K.  If the chunk is uncompressible, then the
   "stored" method is used.  In this case, the bytes are simply stored as
   is, eight bits per byte, with none of the above coding.  The bytes are
   preceded by a count, since there is no longer an EOB code.

   If the data is compressible, then either the fixed or dynamic methods
   are used.  In the dynamic method, the compressed data is preceded by
   an encoding of the literal/length and distance Huffman codes that are
   to be used to decode this block.  The representation is itself Huffman
   coded, and so is preceded by a description of that code.  These code
   descriptions take up a little space, and so for small blocks, there is
   a predefined set of codes, called the fixed codes.  The fixed method is
   used if the block codes up smaller that way (usually for quite small
   chunks), otherwise the dynamic method is used.  In the latter case, the
   codes are customized to the probabilities in the current block, and so
   can code it much better than the pre-determined fixed codes.

   The Huffman codes themselves are decoded using a multi-level table
   lookup, in order to maximize the speed of decoding plus the speed of
   building the decoding tables.  See the comments below that precede the
   lbits and dbits tuning parameters.
 */

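/* Illustrative sketch only, not part of gzip: the literal / (length,
   distance) decoding step described above, applied to tokens that are
   assumed to be already Huffman-decoded.  The names sketch_token and
   sketch_apply_tokens are hypothetical, and a flat output buffer stands
   in for the 32K sliding window. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token: one literal byte, or a match against earlier output. */
struct sketch_token {
  int is_match;           /* 0: literal, 1: (length, distance) match */
  unsigned char literal;  /* valid when is_match == 0 */
  unsigned length;        /* valid when is_match == 1 */
  unsigned distance;      /* valid when is_match == 1, counted backwards */
};

/* Apply ntok tokens to out[0..cap-1].  Return the number of bytes produced,
   or (size_t)-1 if a token is invalid or the output would overflow. */
size_t sketch_apply_tokens(const struct sketch_token *tok, size_t ntok,
                           unsigned char *out, size_t cap)
{
  size_t w = 0;                 /* current output position */
  size_t i;

  assert(tok != NULL && out != NULL);
  for (i = 0; i < ntok; i++) {
    if (!tok[i].is_match) {
      if (w >= cap)
        return (size_t)-1;
      out[w++] = tok[i].literal;
    } else {
      unsigned n = tok[i].length;
      size_t d = tok[i].distance;

      if (d == 0 || d > w || n > cap - w)
        return (size_t)-1;      /* points before the start, or overflows */
      /* Copy byte by byte: source and destination overlap whenever
         distance < length, which is how runs get encoded. */
      while (n--) {
        out[w] = out[w - d];
        w++;
      }
    }
  }
  return w;
}
```

/* With the tokens 'a', 'b', (length 4, distance 2), the sketch produces
   the six bytes "ababab". */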
/*
   Notes beyond the 1.93a appnote.txt:

   1. Distance pointers never point before the beginning of the output
      stream.
   2. Distance pointers can point back across blocks, up to 32k away.
   3. There is an implied maximum of 7 bits for the bit length table and
      15 bits for the actual data.
   4. If only one code exists, then it is encoded using one bit.  (Zero
      would be more efficient, but perhaps a little confusing.)  If two
      codes exist, they are coded using one bit each (0 and 1).
   5. There is no way of sending zero distance codes--a dummy must be
      sent if there are none.  (History: a pre 2.0 version of PKZIP would
      store blocks with no distance codes, but this was discovered to be
      too harsh a criterion.)  Valid only for 1.93a.  2.04c does allow
      zero distance codes, which are sent as one code of zero bits in
      length.
   6. There are up to 286 literal/length codes.  Code 256 represents the
      end-of-block.  Note however that the static length tree defines
      288 codes just to fill out the Huffman codes.  Codes 286 and 287
      cannot be used though, since there is no length base or extra bits
      defined for them.  Similarly, there are up to 30 distance codes.
      However, static trees define 32 codes (all 5 bits) to fill out the
      Huffman codes, but the last two had better not show up in the data.
   7. Unzip can check dynamic Huffman blocks for complete code sets.
      The exception is that a single code would not be complete (see #4).
   8. The five bits following the block type are really the number of
      literal codes sent minus 257.
   9. Length codes 8,16,16 are interpreted as 13 length codes of 8 bits
      (1+6+6).  Therefore, to output three times the length, you output
      three codes (1+1+1), whereas to output four times the same length,
      you only need two codes (1+3).  Hmm.
  10. In the tree reconstruction algorithm, Code = Code + Increment
      only if BitLength(i) is not zero.  (Pretty obvious.)
  11. Correction: 4 Bits: # of Bit Length codes - 4     (4 - 19)
  12. Note: length code 284 can represent 227-258, but length code 285
      really is 258.  The last length deserves its own, short code
      since it gets used a lot in very redundant files.  The length
      258 is special since 258 - 3 (the min match length) is 255.
  13. The literal/length and distance code bit lengths are read as a
      single stream of lengths.  It is possible (and advantageous) for
      a repeat code (16, 17, or 18) to go across the boundary between
      the two sets of lengths.
 */

#include <config.h>
#include "tailor.h"

#if defined STDC_HEADERS || defined HAVE_STDLIB_H
#  include <stdlib.h>
#endif

#include "gzip.h"
#define slide window

/* Huffman code lookup table entry--this entry is four bytes for machines
   that have 16-bit pointers (e.g. PC's in the small or medium model).
   Valid extra bits are 0..13.  e == 15 is EOB (end of block), e == 16
   means that v is a literal, 16 < e < 32 means that v is a pointer to
   the next table, which codes e - 16 bits, and lastly e == 99 indicates
   an unused code.  If a code with e == 99 is looked up, this implies an
   error in the data. */
struct huft {
  uch e;                /* number of extra bits or operation */
  uch b;                /* number of bits in this code or subcode */
  union {
    ush n;              /* literal, length base, or distance base */
    struct huft *t;     /* pointer to next level of table */
  } v;
};

/* Function prototypes */
int huft_build OF((unsigned *, unsigned, unsigned, ush *, ush *,
                   struct huft **, int *));
int huft_free OF((struct huft *));
int inflate_codes OF((struct huft *, struct huft *, int, int));
int inflate_stored OF((void));
int inflate_fixed OF((void));
int inflate_dynamic OF((void));
int inflate_block OF((int *));
int inflate OF((void));

/* The inflate algorithm uses a sliding 32K byte window on the uncompressed
   stream to find repeated byte strings.  This is implemented here as a
   circular buffer.  The index is updated simply by incrementing and then
   and'ing with 0x7fff (32K-1). */
/* It is left to other modules to supply the 32K area.  It is assumed
   to be usable as if it were declared "uch slide[32768];" or as just
   "uch *slide;" and then malloc'ed in the latter case.  The definition
   must be in gzip.h, included above. */
/* unsigned wp;             current position in slide */
#define wp outcnt
#define flush_output(w) (wp=(w),flush_window())

/* Tables for deflate from PKZIP's appnote.txt. */
static unsigned border[] = {    /* Order of the bit length code lengths */
        16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15};
static ush cplens[] = {         /* Copy lengths for literal codes 257..285 */
        3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
        35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258, 0, 0};
        /* note: see note #12 above about the 258 in this list. */
static ush cplext[] = {         /* Extra bits for literal codes 257..285 */
        0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
        3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0, 99, 99}; /* 99==invalid */
static ush cpdist[] = {         /* Copy offsets for distance codes 0..29 */
        1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
        257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
        8193, 12289, 16385, 24577};
static ush cpdext[] = {         /* Extra bits for distance codes */
        0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6,
        7, 7, 8, 8, 9, 9, 10, 10, 11, 11,
        12, 12, 13, 13};

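/* Illustrative sketch only, not part of gzip: how a base value and its
   extra bits combine into a match length.  The tables repeat the first 29
   entries of cplens[] and cplext[]; the names sketch_len_base,
   sketch_len_extra and sketch_match_length are hypothetical.  The caller
   is assumed to have read the extra-bit field from the stream already. */

```c
#include <assert.h>

/* Base lengths for literal/length codes 257..285 (from cplens[]). */
const unsigned short sketch_len_base[29] = {
  3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
  35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258};

/* Number of extra bits for the same codes (from cplext[]). */
const unsigned char sketch_len_extra[29] = {
  0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2,
  3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 0};

/* Return the match length for length symbol sym (257..285).  Only the low
   sketch_len_extra[] bits of extra_field are used; the rest are masked. */
unsigned sketch_match_length(unsigned sym, unsigned extra_field)
{
  unsigned idx;

  assert(sym >= 257 && sym <= 285);
  idx = sym - 257;
  return sketch_len_base[idx]
         + (extra_field & ((1u << sketch_len_extra[idx]) - 1));
}
```

/* For example, symbol 265 has base 11 and one extra bit, so it covers
   lengths 11 and 12, while symbol 285 always means 258. */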
/* Macros for inflate() bit peeking and grabbing.
   The usage is:

        NEEDBITS(j)
        x = b & mask_bits[j];
        DUMPBITS(j)

   where NEEDBITS makes sure that b has at least j bits in it, and
   DUMPBITS removes the bits from b.  The macros use the variable k
   for the number of bits in b.  Normally, b and k are register
   variables for speed, and are initialized at the beginning of a
   routine that uses these macros from a global bit buffer and count.
   The macros also use the variable w, which is a cached copy of wp.

   If we assume that EOB will be the longest code, then we will never
   ask for bits with NEEDBITS that are beyond the end of the stream.
   So, NEEDBITS should not read any more bytes than are needed to
   meet the request.  Then no bytes need to be "returned" to the buffer
   at the end of the last block.

   However, this assumption is not true for fixed blocks--the EOB code
   is 7 bits, but the other literal/length codes can be 8 or 9 bits.
   (The EOB code is shorter than other codes because fixed blocks are
   generally short.  So, while a block always has an EOB, many other
   literal/length codes have a significantly lower probability of
   showing up at all.)  However, by making the first table have a
   lookup of seven bits, the EOB code will be found in that first
   lookup, and so will not require that too many bits be pulled from
   the stream.
 */

ulg bb;                         /* bit buffer */
unsigned bk;                    /* bits in bit buffer */

ush mask_bits[] = {
    0x0000,
    0x0001, 0x0003, 0x0007, 0x000f, 0x001f, 0x003f, 0x007f, 0x00ff,
    0x01ff, 0x03ff, 0x07ff, 0x0fff, 0x1fff, 0x3fff, 0x7fff, 0xffff
};

#define GETBYTE() (inptr < insize ? inbuf[inptr++] : (wp = w, fill_inbuf(0)))

#ifdef CRYPT
  uch cc;
#  define NEXTBYTE() \
     (decrypt ? (cc = GETBYTE(), zdecode(cc), cc) : GETBYTE())
#else
#  define NEXTBYTE()  (uch)GETBYTE()
#endif
#define NEEDBITS(n) {while(k<(n)){b|=((ulg)NEXTBYTE())<<k;k+=8;}}
#define DUMPBITS(n) {b>>=(n);k-=(n);}

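/* Illustrative sketch only, not part of gzip: the NEEDBITS / mask /
   DUMPBITS pattern written as one function over an in-memory buffer.
   The struct sketch_bits and sketch_get_bits names are hypothetical, and
   unlike the macros this version reports end of input instead of calling
   fill_inbuf(). */

```c
#include <assert.h>
#include <stddef.h>

struct sketch_bits {
  const unsigned char *in;  /* input bytes */
  size_t len;               /* number of input bytes */
  size_t pos;               /* index of the next byte to load */
  unsigned long b;          /* bit buffer; the next bit is bit 0 */
  unsigned k;               /* number of valid bits in b */
};

/* Read n bits (0..16), least significant bit first.
   Return the value, or -1 if the input runs out. */
long sketch_get_bits(struct sketch_bits *s, unsigned n)
{
  unsigned long x;

  assert(s != NULL && n <= 16);
  while (s->k < n) {                        /* NEEDBITS(n) */
    if (s->pos >= s->len)
      return -1;
    s->b |= (unsigned long)s->in[s->pos++] << s->k;
    s->k += 8;
  }
  x = s->b & ((1ul << n) - 1);              /* x = b & mask_bits[n] */
  s->b >>= n;                               /* DUMPBITS(n) */
  s->k -= n;
  return (long)x;
}
```

/* Reading 3, 5 and 4 bits from the bytes 0xB5 0x01 yields 5, 22 and 1:
   new bytes enter above the bits already buffered, and fields are taken
   from the low end. */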
/*
   Huffman code decoding is performed using a multi-level table lookup.
   The fastest way to decode is to simply build a lookup table whose
   size is determined by the longest code.  However, the time it takes
   to build this table can also be a factor if the data being decoded
   is not very long.  The most common codes are necessarily the
   shortest codes, so those codes dominate the decoding time, and hence
   the speed.  The idea is you can have a shorter table that decodes the
   shorter, more probable codes, and then point to subsidiary tables for
   the longer codes.  The time it costs to decode the longer codes is
   then traded against the time it takes to make longer tables.

   The results of this trade are in the variables lbits and dbits
   below.  lbits is the number of bits the first level table for literal/
   length codes can decode in one step, and dbits is the same thing for
   the distance codes.  Subsequent tables are also less than or equal to
   those sizes.  These values may be adjusted either when all of the
   codes are shorter than that, in which case the longest code length in
   bits is used, or when the shortest code is *longer* than the requested
   table size, in which case the length of the shortest code in bits is
   used.

   There are two different values for the two tables, since they code a
   different number of possibilities each.  The literal/length table
   codes 286 possible values, or in a flat code, a little over eight
   bits.  The distance table codes 30 possible values, or a little less
   than five bits, flat.  The optimum values for speed end up being
   about one bit more than those, so lbits is 8+1 and dbits is 5+1.
   The optimum values may differ though from machine to machine, and
   possibly even between compilers.  Your mileage may vary.
 */

int lbits = 9;          /* bits in base literal/length lookup table */
int dbits = 6;          /* bits in base distance lookup table */


/* If BMAX needs to be larger than 16, then h and x[] should be ulg. */
#define BMAX 16         /* maximum bit length of any code (16 for explode) */
#define N_MAX 288       /* maximum number of codes in any set */


unsigned hufts;         /* track memory usage */

int huft_build(b, n, s, d, e, t, m)
unsigned *b;            /* code lengths in bits (all assumed <= BMAX) */
unsigned n;             /* number of codes (assumed <= N_MAX) */
unsigned s;             /* number of simple-valued codes (0..s-1) */
ush *d;                 /* list of base values for non-simple codes */
ush *e;                 /* list of extra bits for non-simple codes */
struct huft **t;        /* result: starting table */
int *m;                 /* maximum lookup bits, returns actual */
/* Given a list of code lengths and a maximum table size, make a set of
   tables to decode that set of codes.  Return zero on success, one if
   the given code set is incomplete (the tables are still built in this
   case), two if the input is invalid (all zero length codes or an
   oversubscribed set of lengths), and three if not enough memory. */
{
  unsigned a;                   /* counter for codes of length k */
  unsigned c[BMAX+1];           /* bit length count table */
  unsigned f;                   /* i repeats in table every f entries */
  int g;                        /* maximum code length */
  int h;                        /* table level */
  register unsigned i;          /* counter, current code */
  register unsigned j;          /* counter */
  register int k;               /* number of bits in current code */
  int l;                        /* bits per table (returned in m) */
  register unsigned *p;         /* pointer into c[], b[], or v[] */
  register struct huft *q;      /* points to current table */
  struct huft r;                /* table entry for structure assignment */
  struct huft *u[BMAX];         /* table stack */
  unsigned v[N_MAX];            /* values in order of bit length */
  register int w;               /* bits before this table == (l * h) */
  unsigned x[BMAX+1];           /* bit offsets, then code stack */
  unsigned *xp;                 /* pointer into x */
  int y;                        /* number of dummy codes added */
  unsigned z;                   /* number of entries in current table */


  /* Generate counts for each bit length */
  memzero(c, sizeof(c));
  p = b;  i = n;
  do {
    Tracecv(*p, (stderr, (n-i >= ' ' && n-i <= '~' ? "%c %d\n" : "0x%x %d\n"),
            n-i, *p));
    c[*p]++;                    /* assume all entries <= BMAX */
    p++;                      /* Can't combine with above line (Solaris bug) */
  } while (--i);
  if (c[0] == n)                /* null input--all zero length codes */
  {
    q = (struct huft *) malloc (3 * sizeof *q);
    if (!q)
      return 3;
    hufts += 3;
    q[0].v.t = (struct huft *) NULL;
    q[1].e = 99;    /* invalid code marker */
    q[1].b = 1;
    q[2].e = 99;    /* invalid code marker */
    q[2].b = 1;
    *t = q + 1;
    *m = 1;
    return 0;
  }


  /* Find minimum and maximum length, bound *m by those */
  l = *m;
  for (j = 1; j <= BMAX; j++)
    if (c[j])
      break;
  k = j;                        /* minimum code length */
  if ((unsigned)l < j)
    l = j;
  for (i = BMAX; i; i--)
    if (c[i])
      break;
  g = i;                        /* maximum code length */
  if ((unsigned)l > i)
    l = i;
  *m = l;


  /* Adjust last length count to fill out codes, if needed */
  for (y = 1 << j; j < i; j++, y <<= 1)
    if ((y -= c[j]) < 0)
      return 2;                 /* bad input: more codes than bits */
  if ((y -= c[i]) < 0)
    return 2;
  c[i] += y;


  /* Generate starting offsets into the value table for each length */
  x[1] = j = 0;
  p = c + 1;  xp = x + 2;
  while (--i) {                 /* note that i == g from above */
    *xp++ = (j += *p++);
  }


  /* Make a table of values in order of bit lengths */
  p = b;  i = 0;
  do {
    if ((j = *p++) != 0)
      v[x[j]++] = i;
  } while (++i < n);
  n = x[g];                   /* set n to length of v */


  /* Generate the Huffman codes and for each, make the table entries */
  x[0] = i = 0;                 /* first Huffman code is zero */
  p = v;                        /* grab values in bit order */
  h = -1;                       /* no tables yet--level -1 */
  w = -l;                       /* bits decoded == (l * h) */
  u[0] = (struct huft *)NULL;   /* just to keep compilers happy */
  q = (struct huft *)NULL;      /* ditto */
  z = 0;                        /* ditto */

  /* go through the bit lengths (k already is bits in shortest code) */
  for (; k <= g; k++)
  {
    a = c[k];
    while (a--)
    {
      /* here i is the Huffman code of length k bits for value *p */
      /* make tables up to required level */
      while (k > w + l)
      {
        h++;
        w += l;                 /* previous table always l bits */

        /* compute minimum size table less than or equal to l bits */
        z = (z = g - w) > (unsigned)l ? l : z;  /* upper limit on table size */
        if ((f = 1 << (j = k - w)) > a + 1)     /* try a k-w bit table */
        {                       /* too few codes for k-w bit table */
          f -= a + 1;           /* deduct codes from patterns left */
          xp = c + k;
          if (j < z)
            while (++j < z)       /* try smaller tables up to z bits */
            {
              if ((f <<= 1) <= *++xp)
                break;            /* enough codes to use up j bits */
              f -= *xp;           /* else deduct codes from patterns */
            }
        }
        z = 1 << j;             /* table entries for j-bit table */

        /* allocate and link in new table */
        if ((q = (struct huft *)malloc((z + 1)*sizeof(struct huft))) ==
            (struct huft *)NULL)
        {
          if (h)
            huft_free(u[0]);
          return 3;             /* not enough memory */
        }
        hufts += z + 1;         /* track memory usage */
        *t = q + 1;             /* link to list for huft_free() */
        *(t = &(q->v.t)) = (struct huft *)NULL;
        u[h] = ++q;             /* table starts after link */

        /* connect to last table, if there is one */
        if (h)
        {
          x[h] = i;             /* save pattern for backing up */
          r.b = (uch)l;         /* bits to dump before this table */
          r.e = (uch)(16 + j);  /* bits in this table */
          r.v.t = q;            /* pointer to this table */
          j = i >> (w - l);     /* (get around Turbo C bug) */
          u[h-1][j] = r;        /* connect to last table */
        }
      }

      /* set up table entry in r */
      r.b = (uch)(k - w);
      if (p >= v + n)
        r.e = 99;               /* out of values--invalid code */
      else if (*p < s)
      {
        r.e = (uch)(*p < 256 ? 16 : 15);    /* 256 is end-of-block code */
        r.v.n = (ush)(*p);             /* simple code is just the value */
        p++;                           /* one compiler does not like *p++ */
      }
      else
      {
        r.e = (uch)e[*p - s];   /* non-simple--look up in lists */
        r.v.n = d[*p++ - s];
      }

      /* fill code-like entries with r */
      f = 1 << (k - w);
      for (j = i >> w; j < z; j += f)
        q[j] = r;

      /* backwards increment the k-bit code i */
      for (j = 1 << (k - 1); i & j; j >>= 1)
        i ^= j;
      i ^= j;

      /* backup over finished tables */
      while ((i & ((1 << w) - 1)) != x[h])
      {
        h--;                    /* don't need to update q */
        w -= l;
      }
    }
  }


  /* Return true (1) if we were given an incomplete table */
  return y != 0 && g != 1;
}

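/* Illustrative sketch only, not part of gzip: assigning canonical Huffman
   codes from a list of code lengths, using the flat count / next-code
   formulation from the deflate specification (RFC 1951) rather than the
   multi-level table construction of huft_build().  The names
   sketch_canonical_codes and SKETCH_MAXBITS are hypothetical.  The codes
   are produced in their natural most-significant-bit-first form, whereas
   huft_build() steps through them bit-reversed ("backwards increment"). */

```c
#include <assert.h>

#define SKETCH_MAXBITS 15       /* longest code length handled here */

/* len[i] is the code length of symbol i (0 means the symbol is unused).
   On success, code[i] holds the code of every used symbol and 0 is
   returned.  Return 2 if the lengths are oversubscribed, mirroring the
   "two" result of huft_build(). */
int sketch_canonical_codes(const unsigned *len, unsigned n, unsigned *code)
{
  unsigned count[SKETCH_MAXBITS + 1] = {0};  /* codes per bit length */
  unsigned next[SKETCH_MAXBITS + 1];         /* next code per bit length */
  unsigned c = 0, bits, i;
  long left = 1;                             /* unassigned bit patterns */

  for (i = 0; i < n; i++) {
    assert(len[i] <= SKETCH_MAXBITS);
    count[len[i]]++;
  }
  count[0] = 0;                 /* unused symbols take no code space */

  for (bits = 1; bits <= SKETCH_MAXBITS; bits++) {
    left = (left << 1) - (long)count[bits];
    if (left < 0)
      return 2;                 /* more codes than bit patterns */
    c = (c + count[bits - 1]) << 1;
    next[bits] = c;
  }

  for (i = 0; i < n; i++)
    if (len[i] != 0)
      code[i] = next[len[i]]++;
  return 0;
}
```

/* For the lengths 3,3,3,3,3,2,4,4 this yields the codes
   010, 011, 100, 101, 110, 00, 1110, 1111. */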
int huft_free(t)
struct huft *t;         /* table to free */
/* Free the malloc'ed tables built by huft_build(), which makes a linked
   list of the tables it made, with the links in a dummy first entry of
   each table. */
{
  register struct huft *p, *q;


  /* Go through linked list, freeing from the malloced (t[-1]) address. */
  p = t;
  while (p != (struct huft *)NULL)
  {
    q = (--p)->v.t;
    free(p);
    p = q;
  }
  return 0;
}

int inflate_codes(tl, td, bl, bd)
struct huft *tl, *td;   /* literal/length and distance decoder tables */
int bl, bd;             /* number of bits decoded by tl[] and td[] */
/* inflate (decompress) the codes in a deflated (compressed) block.
   Return an error code or zero if it all goes ok. */
{
  register unsigned e;  /* table entry flag/number of extra bits */
  unsigned n, d;        /* length and index for copy */
  unsigned w;           /* current window position */
  struct huft *t;       /* pointer to table entry */
  unsigned ml, md;      /* masks for bl and bd bits */
  register ulg b;       /* bit buffer */
  register unsigned k;  /* number of bits in bit buffer */


  /* make local copies of globals */
  b = bb;                       /* initialize bit buffer */
  k = bk;
  w = wp;                       /* initialize window position */

  /* inflate the coded data */
  ml = mask_bits[bl];           /* precompute masks for speed */
  md = mask_bits[bd];
  for (;;)                      /* do until end of block */
  {
    NEEDBITS((unsigned)bl)
    if ((e = (t = tl + ((unsigned)b & ml))->e) > 16)
      do {
        if (e == 99)
          return 1;
        DUMPBITS(t->b)
        e -= 16;
        NEEDBITS(e)
      } while ((e = (t = t->v.t + ((unsigned)b & mask_bits[e]))->e) > 16);
    DUMPBITS(t->b)
    if (e == 16)                /* then it's a literal */
    {
      slide[w++] = (uch)t->v.n;
      Tracevv((stderr, "%c", slide[w-1]));
      if (w == WSIZE)
      {
        flush_output(w);
        w = 0;
      }
    }
    else                        /* it's an EOB or a length */
    {
      /* exit if end of block */
      if (e == 15)
        break;

      /* get length of block to copy */
      NEEDBITS(e)
      n = t->v.n + ((unsigned)b & mask_bits[e]);
      DUMPBITS(e);

      /* decode distance of block to copy */
      NEEDBITS((unsigned)bd)
      if ((e = (t = td + ((unsigned)b & md))->e) > 16)
        do {
          if (e == 99)
            return 1;
          DUMPBITS(t->b)
          e -= 16;
          NEEDBITS(e)
        } while ((e = (t = t->v.t + ((unsigned)b & mask_bits[e]))->e) > 16);
      DUMPBITS(t->b)
      NEEDBITS(e)
      d = w - t->v.n - ((unsigned)b & mask_bits[e]);
      DUMPBITS(e)
      Tracevv((stderr,"\\[%d,%d]", w-d, n));

      /* do the copy */
      do {
        n -= (e = (e = WSIZE - ((d &= WSIZE-1) > w ? d : w)) > n ? n : e);
#if !defined(NOMEMCPY) && !defined(DEBUG)
        if (w - d >= e)         /* (this test assumes unsigned comparison) */
        {
          memcpy(slide + w, slide + d, e);
          w += e;
          d += e;
        }
        else                      /* do it slow to avoid memcpy() overlap */
#endif /* !NOMEMCPY */
          do {
            slide[w++] = slide[d++];
            Tracevv((stderr, "%c", slide[w-1]));
          } while (--e);
        if (w == WSIZE)
        {
          flush_output(w);
          w = 0;
        }
      } while (n);
    }
  }


  /* restore the globals from the locals */
  wp = w;                       /* restore global window pointer */
  bb = b;                       /* restore global bit buffer */
  bk = k;

  /* done */
  return 0;
}

int inflate_stored()
/* "decompress" an inflated type 0 (stored) block. */
{
  unsigned n;           /* number of bytes in block */
  unsigned w;           /* current window position */
  register ulg b;       /* bit buffer */
  register unsigned k;  /* number of bits in bit buffer */


  /* make local copies of globals */
  b = bb;                       /* initialize bit buffer */
  k = bk;
  w = wp;                       /* initialize window position */


  /* go to byte boundary */
  n = k & 7;
  DUMPBITS(n);


  /* get the length and its complement */
  NEEDBITS(16)
  n = ((unsigned)b & 0xffff);
  DUMPBITS(16)
  NEEDBITS(16)
  if (n != (unsigned)((~b) & 0xffff))
    return 1;                   /* error in compressed data */
  DUMPBITS(16)


  /* read and output the stored (uncompressed) data */
  while (n--)
  {
    NEEDBITS(8)
    slide[w++] = (uch)b;
    if (w == WSIZE)
    {
      flush_output(w);
      w = 0;
    }
    DUMPBITS(8)
  }


  /* restore the globals from the locals */
  wp = w;                       /* restore global window pointer */
  bb = b;                       /* restore global bit buffer */
  bk = k;
  return 0;
}

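/* Illustrative sketch only, not part of gzip: the length / complement
   check that inflate_stored() performs on a stored block, written against
   the four header bytes that follow the byte-boundary alignment.  The
   name sketch_stored_length is hypothetical; both fields are assumed to
   be 16-bit values stored low byte first, as the bit order above implies. */

```c
#include <assert.h>
#include <stddef.h>

/* hdr[0..1] is the block length, hdr[2..3] its one's complement.
   Store the length in *len and return 0 if the two agree, else return 1
   (the same "error in compressed data" result as inflate_stored()). */
int sketch_stored_length(const unsigned char hdr[4], unsigned *len)
{
  unsigned n, nc;

  assert(hdr != NULL && len != NULL);
  n  = hdr[0] | ((unsigned)hdr[1] << 8);
  nc = hdr[2] | ((unsigned)hdr[3] << 8);
  if (n != (~nc & 0xffffu))
    return 1;                   /* complement does not match */
  *len = n;
  return 0;
}
```

/* A header of 05 00 FA FF announces five stored bytes; any other
   complement for the same length is rejected. */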
int inflate_fixed()
/* decompress an inflated type 1 (fixed Huffman codes) block.  We should
   either replace this with a custom decoder, or at least precompute the
   Huffman tables. */
{
  int i;                /* temporary variable */
  struct huft *tl;      /* literal/length code table */
  struct huft *td;      /* distance code table */
  int bl;               /* lookup bits for tl */
  int bd;               /* lookup bits for td */
  unsigned l[288];      /* length list for huft_build */


  /* set up literal table */
  for (i = 0; i < 144; i++)
    l[i] = 8;
  for (; i < 256; i++)
    l[i] = 9;
  for (; i < 280; i++)
    l[i] = 7;
  for (; i < 288; i++)          /* make a complete, but wrong code set */
    l[i] = 8;
  bl = 7;
  if ((i = huft_build(l, 288, 257, cplens, cplext, &tl, &bl)) != 0)
    return i;


  /* set up distance table */
  for (i = 0; i < 30; i++)      /* make an incomplete code set */
    l[i] = 5;
  bd = 5;
  if ((i = huft_build(l, 30, 0, cpdist, cpdext, &td, &bd)) > 1)
  {
    huft_free(tl);
    return i;
  }


  /* decompress until an end-of-block code */
  if (inflate_codes(tl, td, bl, bd))
    return 1;


  /* free the decoding tables, return */
  huft_free(tl);
  huft_free(td);
  return 0;
}

int inflate_dynamic()
/* decompress an inflated type 2 (dynamic Huffman codes) block. */
{
  int i;                /* temporary variables */
  unsigned j;
  unsigned l;           /* last length */
  unsigned m;           /* mask for bit lengths table */
  unsigned n;           /* number of lengths to get */
  unsigned w;           /* current window position */
  struct huft *tl;      /* literal/length code table */
  struct huft *td;      /* distance code table */
  int bl;               /* lookup bits for tl */
  int bd;               /* lookup bits for td */
  unsigned nb;          /* number of bit length codes */
  unsigned nl;          /* number of literal/length codes */
  unsigned nd;          /* number of distance codes */
#ifdef PKZIP_BUG_WORKAROUND
  unsigned ll[288+32];  /* literal/length and distance code lengths */
#else
  unsigned ll[286+30];  /* literal/length and distance code lengths */
#endif
  register ulg b;       /* bit buffer */
  register unsigned k;  /* number of bits in bit buffer */


  /* make local bit buffer */
  b = bb;
  k = bk;
  w = wp;


  /* read in table lengths */
  NEEDBITS(5)
  nl = 257 + ((unsigned)b & 0x1f);      /* number of literal/length codes */
  DUMPBITS(5)
  NEEDBITS(5)
  nd = 1 + ((unsigned)b & 0x1f);        /* number of distance codes */
  DUMPBITS(5)
  NEEDBITS(4)
  nb = 4 + ((unsigned)b & 0xf);         /* number of bit length codes */
  DUMPBITS(4)
#ifdef PKZIP_BUG_WORKAROUND
  if (nl > 288 || nd > 32)
#else
  if (nl > 286 || nd > 30)
#endif
    return 1;                   /* bad lengths */


  /* read in bit-length-code lengths */
  for (j = 0; j < nb; j++)
  {
    NEEDBITS(3)
    ll[border[j]] = (unsigned)b & 7;
    DUMPBITS(3)
  }
  for (; j < 19; j++)
    ll[border[j]] = 0;


  /* build decoding table for trees--single level, 7 bit lookup */
  bl = 7;
  if ((i = huft_build(ll, 19, 19, NULL, NULL, &tl, &bl)) != 0)
  {
    if (i == 1)
      huft_free(tl);
    return i;                   /* incomplete code set */
  }

  if (tl == NULL)               /* Grrrhhh */
    return 2;

  /* read in literal and distance code lengths */
  n = nl + nd;
  m = mask_bits[bl];
  i = l = 0;
  while ((unsigned)i < n)
  {
    NEEDBITS((unsigned)bl)
    j = (td = tl + ((unsigned)b & m))->b;
    DUMPBITS(j)
    j = td->v.n;
    if (j < 16)                 /* length of code in bits (0..15) */
      ll[i++] = l = j;          /* save last length in l */
    else if (j == 16)           /* repeat last length 3 to 6 times */
    {
      NEEDBITS(2)
      j = 3 + ((unsigned)b & 3);
      DUMPBITS(2)
      if ((unsigned)i + j > n)
        return 1;
      while (j--)
        ll[i++] = l;
    }
    else if (j == 17)           /* 3 to 10 zero length codes */
    {
      NEEDBITS(3)
      j = 3 + ((unsigned)b & 7);
      DUMPBITS(3)
      if ((unsigned)i + j > n)
        return 1;
      while (j--)
        ll[i++] = 0;
      l = 0;
    }
    else                        /* j == 18: 11 to 138 zero length codes */
    {
      NEEDBITS(7)
      j = 11 + ((unsigned)b & 0x7f);
      DUMPBITS(7)
      if ((unsigned)i + j > n)
        return 1;
      while (j--)
        ll[i++] = 0;
      l = 0;
    }
  }


  /* free decoding table for trees */
  huft_free(tl);


  /* restore the global bit buffer */
  bb = b;
  bk = k;


  /* build the decoding tables for literal/length and distance codes */
  bl = lbits;
  if ((i = huft_build(ll, nl, 257, cplens, cplext, &tl, &bl)) != 0)
  {
    if (i == 1) {
      Trace ((stderr, " incomplete literal tree\n"));
      huft_free(tl);
    }
    return i;                   /* incomplete code set */
  }
  bd = dbits;
  if ((i = huft_build(ll + nl, nd, 0, cpdist, cpdext, &td, &bd)) != 0)
  {
    if (i == 1) {
      Trace ((stderr, " incomplete distance tree\n"));
#ifdef PKZIP_BUG_WORKAROUND
      i = 0;
    }
#else
      huft_free(td);
    }
    huft_free(tl);
    return i;                   /* incomplete code set */
#endif
  }


  {
    /* decompress until an end-of-block code */
    int err = inflate_codes(tl, td, bl, bd) ? 1 : 0;

    /* free the decoding tables */
    huft_free(tl);
    huft_free(td);

    return err;
  }
}

int inflate_block(e)
int *e;                 /* last block flag */
/* decompress an inflated block */
{
  unsigned t;           /* block type */
  unsigned w;           /* current window position */
  register ulg b;       /* bit buffer */
  register unsigned k;  /* number of bits in bit buffer */


  /* make local bit buffer */
  b = bb;
  k = bk;
  w = wp;


  /* read in last block bit */
  NEEDBITS(1)
  *e = (int)b & 1;
  DUMPBITS(1)


  /* read in block type */
  NEEDBITS(2)
  t = (unsigned)b & 3;
  DUMPBITS(2)


  /* restore the global bit buffer */
  bb = b;
  bk = k;


  /* inflate that block type */
  if (t == 2)
    return inflate_dynamic();
  if (t == 0)
    return inflate_stored();
  if (t == 1)
    return inflate_fixed();


  /* bad block type */
  return 2;
}

int inflate()
/* decompress an inflated entry */
{
  int e;                /* last block flag */
  int r;                /* result code */
  unsigned h;           /* maximum struct huft's malloc'ed */


  /* initialize window, bit buffer */
  wp = 0;
  bk = 0;
  bb = 0;


  /* decompress until the last block */
  h = 0;
  do {
    hufts = 0;
    if ((r = inflate_block(&e)) != 0)
      return r;
    if (hufts > h)
      h = hufts;
  } while (!e);

  /* Undo too much lookahead. The next read will be byte aligned so we
   * can discard unused bits in the last meaningful byte.
   */
  while (bk >= 8) {
    bk -= 8;
    inptr--;
  }

  /* flush out slide */
  flush_output(wp);


  /* return success */
  Trace ((stderr, "<%u> ", h));
  return 0;
}