This directory contains a bunch of files to test handling of .xz files
in .xz decoder implementations. Many of the files have been created
by hand with a hex editor, thus there is no better "source code" than
the files themselves. All the test files (*.xz) and this README have
been put into the public domain.

Good files (good-*.xz) must decode successfully without requiring
a lot of CPU time or RAM.

Unsupported files (unsupported-*.xz) are good files, but headers
indicate features not supported by the current file format.

Bad files (bad-*.xz) must cause the decoder to give an error. As
with the good files, these files must not require a lot of CPU time
or RAM before they are detected as broken.
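
The three file types above can be turned into a small test harness.
The following is a minimal sketch in Python using the standard lzma
module, not part of the test suite itself. The function names and the
MEMLIMIT value are hypothetical, and treating every unsupported-*.xz
file as "accept or reject" is an assumption of this sketch. Note also
that lzma.decompress() ignores trailing data after a valid first
Stream, so a harness built on it may fail to flag some bad files.

```python
import lzma

# Hypothetical cap so that a file cannot demand a lot of RAM.
MEMLIMIT = 64 * 1024 * 1024


def expected_outcome(filename: str) -> str:
    """Map a test file name to the outcome its prefix requires."""
    if filename.startswith("good-"):
        return "ok"
    if filename.startswith("bad-"):
        return "error"
    if filename.startswith("unsupported-"):
        return "either"  # assumption: accept or reject is fine
    raise ValueError("not a test file name: " + filename)


def actual_outcome(data: bytes) -> str:
    """Try to decode data as .xz and report what happened."""
    try:
        lzma.decompress(data, format=lzma.FORMAT_XZ, memlimit=MEMLIMIT)
    except lzma.LZMAError:
        return "error"
    return "ok"


def passes(filename: str, data: bytes) -> bool:
    """True if the decoder behaved as the file name demands."""
    want = expected_outcome(filename)
    return want == "either" or want == actual_outcome(data)
```

A real harness would also measure CPU time, which this sketch does
not attempt.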
2. Descriptions of Individual Files

2.1. Good Files

good-0-empty.xz has one Stream with no Blocks.

good-0pad-empty.xz has one Stream with no Blocks followed by
four-byte Stream Padding.

good-0cat-empty.xz has two zero-Block Streams concatenated without
Stream Padding.

good-0catpad-empty.xz has two zero-Block Streams concatenated with
four-byte Stream Padding between the Streams.

good-1-check-none.xz has one Stream with one Block with two
uncompressed LZMA2 chunks and no integrity check.

good-1-check-crc32.xz has one Stream with one Block with two
uncompressed LZMA2 chunks and CRC32 check.

good-1-check-crc64.xz is like good-1-check-crc32.xz but with CRC64.

good-1-check-sha256.xz is like good-1-check-crc32.xz but with
SHA-256.
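
The integrity check type of a Stream is recorded in the Stream Flags
of the Stream Header. As an illustration, the sketch below reads the
Check ID from the first Stream of a file. It assumes the Stream Header
layout of the .xz format (six magic bytes, two Stream Flags bytes,
four CRC32 bytes); the function name and the CHECK_NAMES table are
hypothetical helpers, and no validation beyond a length test is done.

```python
import lzma

# Check IDs used by the good-1-check-*.xz files.
CHECK_NAMES = {0x00: "None", 0x01: "CRC32", 0x04: "CRC64", 0x0A: "SHA-256"}


def check_id(xz_data: bytes) -> int:
    """Return the Check ID stored in the first Stream Header."""
    if len(xz_data) < 12:
        raise ValueError("too short to hold a Stream Header")
    # Byte 6 is the first Stream Flags byte; the low four bits of
    # byte 7 hold the Check ID.
    return xz_data[7] & 0x0F
```

For example, check_id(lzma.compress(b"x", check=lzma.CHECK_CRC32))
gives 1, which CHECK_NAMES maps to "CRC32".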
good-2-lzma2.xz has one Stream with two Blocks with one uncompressed
LZMA2 chunk in each Block.

good-1-block_header-1.xz has both Compressed Size and Uncompressed
Size in the Block Header. It also has four extra bytes of Header
Padding.

good-1-block_header-2.xz has known Compressed Size.

good-1-block_header-3.xz has known Uncompressed Size.

good-1-delta-lzma2.tiff.xz is an image file that compresses
better with Delta+LZMA2 than with plain LZMA2.
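
A Delta+LZMA2 chain like the one in good-1-delta-lzma2.tiff.xz can be
produced with Python's lzma module. This is only a sketch: the
function name is hypothetical, and the Delta distance and preset are
arbitrary choices, not the values used to create the test file.

```python
import lzma


def compress_delta_lzma2(data: bytes, dist: int) -> bytes:
    """Compress data into .xz using Delta followed by LZMA2."""
    filters = [
        # Delta subtracts the byte found dist bytes earlier.
        {"id": lzma.FILTER_DELTA, "dist": dist},
        # LZMA2 has to be the last filter in the chain.
        {"id": lzma.FILTER_LZMA2, "preset": 6},
    ]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
```

The filter chain is stored in the Block Header, so plain
lzma.decompress() can decode the result without being told about the
Delta filter.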
good-1-x86-lzma2.xz uses the x86 filter (BCJ) and LZMA2. The
uncompressed file is compress_prepared_bcj_x86 found from the tests

good-1-sparc-lzma2.xz uses the SPARC filter and LZMA2. The
uncompressed file is compress_prepared_bcj_sparc found from the tests

good-1-lzma2-1.xz has two LZMA2 chunks, of which the second sets

good-1-lzma2-2.xz has two LZMA2 chunks, of which the second resets
the state without specifying new properties.

good-1-lzma2-3.xz has two LZMA2 chunks, of which the first is
uncompressed and the second is LZMA. The first chunk resets dictionary
and the second sets new properties.

good-1-lzma2-4.xz has three LZMA2 chunks: First is LZMA, second is
uncompressed with dictionary reset, and third is LZMA with new
properties but without dictionary reset.

good-1-lzma2-5.xz has an empty LZMA2 stream with only the end of
payload marker. XZ Utils 5.0.1 and older incorrectly see this file

good-1-3delta-lzma2.xz has three Delta filters and LZMA2.

2.2. Unsupported Files

unsupported-check.xz uses Check ID 0x02 which isn't supported by
the current version of the file format. It is implementation-defined
how this file is handled (it may reject it, or decode it possibly with

unsupported-block_header.xz has a non-null byte in Header Padding,
which may indicate presence of a new unsupported field.

unsupported-filter_flags-1.xz has unsupported Filter ID 0x7F.

unsupported-filter_flags-2.xz specifies only Delta filter in the
List of Filter Flags, but Delta isn't allowed as the last filter in
the chain. It would be slightly more correct to detect this file as
corrupt instead of unsupported, but saying it is unsupported is
simpler in the case of liblzma.

unsupported-filter_flags-3.xz specifies two LZMA2 filters in the
List of Filter Flags. LZMA2 is allowed only as the last filter in the
chain. It would be slightly more correct to detect this file as
corrupt instead of unsupported, but saying it is unsupported is
simpler in the case of liblzma.

2.3. Bad Files

bad-0pad-empty.xz has one Stream with no Blocks followed by
five-byte Stream Padding. Stream Padding must be a multiple of four
bytes, thus this file is corrupt.
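
The Stream Padding rule can be written down in one line. The sketch
below is a hypothetical helper, not liblzma code; it assumes the
caller has already isolated the bytes that follow a Stream Footer.

```python
def is_valid_stream_padding(padding: bytes) -> bool:
    """Stream Padding must be null bytes, a multiple of four long."""
    return len(padding) % 4 == 0 and not any(padding)
```

Four null bytes pass this test, while the five null bytes of
bad-0pad-empty.xz do not.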
bad-0catpad-empty.xz has two zero-Block Streams concatenated with
five-byte Stream Padding between the Streams.

bad-0cat-alone.xz is good-0-empty.xz concatenated with an empty

bad-0cat-header_magic.xz is good-0cat-empty.xz but with one byte
wrong in the Header Magic Bytes field of the second Stream. liblzma
gives LZMA_DATA_ERROR for this. (LZMA_FORMAT_ERROR is used only if
the first Stream of a file has invalid Header Magic Bytes.)

bad-0-header_magic.xz is good-0-empty.xz but with one byte wrong
in the Header Magic Bytes field. liblzma gives LZMA_FORMAT_ERROR for
this.

bad-0-footer_magic.xz is good-0-empty.xz but with one byte wrong
in the Footer Magic Bytes field. liblzma gives LZMA_DATA_ERROR for
this.
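
The magic byte checks described above can be sketched as follows for
a file that holds a single Stream and no Stream Padding. The function
name is hypothetical and the returned strings merely mirror the
liblzma return codes named in the text; this is not how liblzma
itself is structured.

```python
HEADER_MAGIC = bytes([0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00])
FOOTER_MAGIC = b"YZ"


def classify_magic(stream: bytes) -> str:
    """Classify a single-Stream file by its magic bytes only."""
    if stream[:6] != HEADER_MAGIC:
        # Wrong magic in the first Stream: not recognized as .xz.
        return "LZMA_FORMAT_ERROR"
    if stream[-2:] != FOOTER_MAGIC:
        # It looked like .xz at the start, so the file is corrupt.
        return "LZMA_DATA_ERROR"
    return "LZMA_OK"
```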
bad-0-empty-truncated.xz is good-0-empty.xz without the last byte.

bad-0-nonempty_index.xz has no Blocks but Index claims that there is

bad-0-backward_size.xz has wrong Backward Size in Stream Footer.

bad-1-stream_flags-1.xz has different Stream Flags in Stream Header
and Stream Footer.

bad-1-stream_flags-2.xz has wrong CRC32 in Stream Header.

bad-1-stream_flags-3.xz has wrong CRC32 in Stream Footer.
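
The Stream Header and Stream Footer checks exercised by these files
can be sketched with zlib.crc32, which uses the same CRC32 as the .xz
format. The sketch assumes the 12-byte layouts of the format (Header:
magic, Stream Flags, CRC32; Footer: CRC32, Backward Size, Stream
Flags, magic). All function names are hypothetical.

```python
import struct
import zlib


def header_crc_ok(header: bytes) -> bool:
    """CRC32 in the Stream Header covers the two Stream Flags bytes."""
    (stored,) = struct.unpack("<I", header[8:12])
    return zlib.crc32(header[6:8]) == stored


def footer_crc_ok(footer: bytes) -> bool:
    """CRC32 in the Footer covers Backward Size and Stream Flags."""
    (stored,) = struct.unpack("<I", footer[0:4])
    return zlib.crc32(footer[4:10]) == stored


def stream_flags_match(header: bytes, footer: bytes) -> bool:
    """Stream Flags must be identical in Header and Footer."""
    return header[6:8] == footer[8:10]


def index_size(footer: bytes) -> int:
    """Decode Backward Size into the size of the Index in bytes."""
    (stored,) = struct.unpack("<I", footer[4:8])
    return (stored + 1) * 4
```

For a single-Stream file without Stream Padding, the Header is the
first 12 bytes and the Footer is the last 12 bytes; the Index then
occupies the index_size(footer) bytes just before the Footer.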
bad-1-vli-1.xz has a two-byte variable-length integer in the
Uncompressed Size field in Block Header while one byte would be enough
for that value. It's important that the file gets rejected because the
integer encoding is too long, not because Uncompressed Size does not
match the value stored in the Block Header. That is, the decoder must
not try to decode the Compressed Data field.

bad-1-vli-2.xz has a ten-byte variable-length integer as Uncompressed
Size in Block Header. It's important that the file gets rejected
because the integer encoding is too long, not because Uncompressed
Size does not match the value stored in the Block Header. That is,
the decoder must not try to decode the Compressed Data field.
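
Both files are meant to be rejected while the variable-length integer
is being decoded. The sketch below follows the decoding rules of the
.xz format (seven payload bits per byte, least significant group
first, at most nine bytes, no unnecessary trailing zero byte). The
function name and the use of ValueError are choices of this sketch.

```python
def decode_vli(buf: bytes, size_max: int = 9) -> tuple:
    """Decode one variable-length integer; return (value, length)."""
    size_max = min(size_max, 9, len(buf))
    if size_max == 0:
        raise ValueError("no input")
    num = buf[0] & 0x7F
    i = 0
    while buf[i] & 0x80:  # high bit set: another byte follows
        i += 1
        if i >= size_max:
            raise ValueError("integer is too long or truncated")
        if buf[i] == 0x00:
            # A zero final byte means a shorter encoding exists.
            raise ValueError("integer is not minimally encoded")
        num |= (buf[i] & 0x7F) << (i * 7)
    return num, i + 1
```

With this decoder, a two-byte encoding such as 0x85 0x00 for the
value 5 fails in the same way as a ten-byte encoding does, before any
size comparison or Compressed Data decoding takes place.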
bad-1-block_header-1.xz has a Block Header that ends in the middle of
the Filter Flags field.

bad-1-block_header-2.xz has a Block Header that has Compressed Size and
Uncompressed Size but no List of Filter Flags field.

bad-1-block_header-3.xz has wrong CRC32 in Block Header.

bad-1-block_header-4.xz has a too large Compressed Size in Block Header
(2^63 - 1 bytes, while the maximum is a little less because the whole
Block must stay smaller than 2^63). It's important that the file
gets rejected due to invalid Compressed Size value; the decoder
must not try decoding the Compressed Data field.

bad-1-block_header-5.xz has zero as Compressed Size in Block Header.

bad-1-block_header-6.xz has a corrupt Block Header which may crash
xz -lvv in XZ Utils 5.0.3 and earlier. It was fixed in the commit
c0297445064951807803457dca1611b3c47e7f0f.

bad-2-index-1.xz has wrong Unpadded Sizes in Index.

bad-2-index-2.xz has wrong Uncompressed Sizes in Index.

bad-2-index-3.xz has a non-null byte in Index Padding.

bad-2-index-4.xz has wrong CRC32 in Index.

bad-2-index-5.xz has zero as Unpadded Size. It is important that the
file gets rejected specifically due to Unpadded Size having an invalid
value.

bad-2-compressed_data_padding.xz has a non-null byte in the padding of
the Compressed Data field of the first Block.

bad-1-check-crc32.xz has wrong Check (CRC32).

bad-1-check-crc64.xz has wrong Check (CRC64).

bad-1-check-sha256.xz has wrong Check (SHA-256).

bad-1-lzma2-1.xz has an LZMA2 stream whose first chunk (uncompressed)
doesn't reset the dictionary.

bad-1-lzma2-2.xz has two LZMA2 chunks, of which the second chunk
indicates dictionary reset, but the LZMA compressed data tries to
repeat data from the previous chunk.

bad-1-lzma2-3.xz sets new invalid properties (lc=8, lp=0, pb=0) in
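
To see why lc=8, lp=0, pb=0 is invalid, recall that the LZMA
properties are packed into one byte as (pb * 5 + lp) * 9 + lc and
that LZMA2 additionally requires lc + lp <= 4. The sketch below
unpacks and validates such a byte; the function name is hypothetical.

```python
def decode_lzma2_props(byte: int) -> tuple:
    """Unpack an LZMA properties byte into (lc, lp, pb)."""
    if byte > (4 * 5 + 4) * 9 + 8:
        raise ValueError("properties byte out of range")
    lc = byte % 9
    rest = byte // 9
    lp = rest % 5
    pb = rest // 5
    if lc + lp > 4:
        raise ValueError("LZMA2 requires lc + lp <= 4")
    return lc, lp, pb
```

The common default byte 0x5D unpacks to lc=3, lp=0, pb=2, whereas the
byte 0x08 (lc=8, lp=0, pb=0) is rejected.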
bad-1-lzma2-4.xz has two LZMA2 chunks, of which the first is
uncompressed and the second is LZMA. The first chunk resets dictionary
as it should, but the second chunk tries to reset state without
specifying properties for LZMA.

bad-1-lzma2-5.xz is like bad-1-lzma2-4.xz but doesn't try to reset
anything in the header of the second chunk.

bad-1-lzma2-6.xz has a reserved LZMA2 control byte value (0x03).

bad-1-lzma2-7.xz has an end of payload marker (EOPM) at LZMA level.

bad-1-lzma2-8.xz is like good-1-lzma2-4.xz but doesn't set new
properties in the third LZMA2 chunk.
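
Several of the LZMA2 files above differ only in the sequence of chunk
control bytes. The sketch below checks such a sequence against the
reset rules: the first chunk must reset the dictionary, and the first
LZMA chunk after a dictionary reset must set new properties. It looks
at control bytes only, never at chunk payloads, so it cannot detect
cases like bad-1-lzma2-2.xz or bad-1-lzma2-7.xz. The function name is
hypothetical.

```python
def validate_control_bytes(controls) -> bool:
    """Check LZMA2 control bytes against the reset rules."""
    need_dict_reset = True
    need_props = True
    for c in controls:
        if c == 0x00:
            return True  # end of the LZMA2 stream
        if c >= 0xE0 or c == 0x01:
            # Dictionary reset; LZMA needs new properties afterwards.
            need_dict_reset = False
            need_props = True
        elif need_dict_reset:
            raise ValueError("first chunk must reset the dictionary")
        if c >= 0x80:  # LZMA chunk
            if c >= 0xC0:
                need_props = False  # chunk carries new properties
            elif need_props:
                raise ValueError("LZMA chunk without properties")
        elif c > 0x02:
            raise ValueError("reserved control byte value")
    raise ValueError("missing end marker")
```

For instance, the sequence 0xE0, 0x01, 0xC0, 0x00 has the shape
described for good-1-lzma2-4.xz and is accepted, while replacing 0xC0
with 0xA0 gives the shape of bad-1-lzma2-8.xz and is rejected.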