0.1. Notices and Acknowledgements
1.1.2. Dictionary Size
1.1.3. Uncompressed Size
1.2. LZMA Compressed Data
This document describes the .lzma file format, which is
sometimes also called the LZMA_Alone format. It is a legacy file
format that is being or has been replaced by the .xz format.
The MIME type of the .lzma format is `application/x-lzma'.

The most commonly used tools for handling .lzma files are
LZMA SDK, LZMA Utils, 7-Zip, and XZ Utils. This document
describes some of the differences between these implementations
and gives hints about which subset of the .lzma format is the
most portable.
0.1. Notices and Acknowledgements

This file format was designed by Igor Pavlov for use in
LZMA SDK. This document was written by Lasse Collin
<lasse.collin@tukaani.org> using the documentation found
in the LZMA SDK.

This document has been put into the public domain.

Last modified: 2009-05-01 11:15+0300
    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+
    |         Header          |   LZMA Compressed Data   |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+==========================+

A .lzma file consists of a 13-byte Header followed by
the LZMA Compressed Data.
Unlike the .gz, .bz2, and .xz formats, it is not possible to
concatenate multiple .lzma files as is and expect the
decompression tool to decode the resulting file as if it were
a single .lzma file.

For example, the command line tools from LZMA Utils and
LZMA SDK silently ignore all the data after the first .lzma
stream. In contrast, the command line tool from XZ Utils
considers the .lzma file to be corrupt if there is data after
the first .lzma stream.
    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
    | Properties |  Dictionary Size  |   Uncompressed Size   |
    +------------+----+----+----+----+--+--+--+--+--+--+--+--+
The Properties field contains three properties. An abbreviation
is given in parentheses, followed by the value range of the
property. The field consists of

    1) the number of literal context bits (lc, [0, 8]);
    2) the number of literal position bits (lp, [0, 4]); and
    3) the number of position bits (pb, [0, 4]).

The properties are encoded using the following formula:

    Properties = (pb * 5 + lp) * 9 + lc
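The following C code illustrates a straightforward way to
decode the Properties field:

    uint8_t prop = get_lzma_properties();
    if (prop > (4 * 5 + 4) * 9 + 8)
        return LZMA_PROPERTIES_ERROR;
    uint8_t pb = prop / (9 * 5);   /* position bits */
    uint8_t lp = (prop / 9) % 5;   /* literal position bits */
    uint8_t lc = prop % 9;         /* literal context bits */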
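The encoding direction of the formula can be sketched in the same way. This is only an illustration and is not taken from any of the implementations discussed in this document; the function name lzma_props_encode and the convention of returning -1 for out-of-range input are assumptions made for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack lc/lp/pb into the Properties byte using
 *     Properties = (pb * 5 + lp) * 9 + lc
 * Returns the byte value (0..224), or -1 if any value is outside the
 * ranges lc [0, 8], lp [0, 4], pb [0, 4].
 */
int lzma_props_encode(uint8_t lc, uint8_t lp, uint8_t pb)
{
    if (lc > 8 || lp > 4 || pb > 4)
        return -1;

    return (pb * 5 + lp) * 9 + lc;
}
```

For example, lc = 3, lp = 0, pb = 2 gives (2 * 5 + 0) * 9 + 3 = 93, that is, a Properties byte of 0x5D.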
XZ Utils has an additional requirement: lc + lp <= 4. Files
that don't follow this requirement cannot be decompressed
with XZ Utils. Usually this isn't a problem since the most
common lc/lp/pb values are 3/0/2. It is the only lc/lp/pb
combination that the files created by LZMA Utils can have,
but LZMA Utils can decompress files with any lc/lp/pb.
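A tool that writes .lzma files may want to test in advance whether a Properties byte stays within the XZ Utils restriction. The following is a minimal sketch of such a test; the function name lzma_props_ok_for_xz is hypothetical and the code is not taken from XZ Utils.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: returns true if the Properties byte is valid for
 * the .lzma format and also satisfies the lc + lp <= 4 requirement
 * of XZ Utils.
 */
bool lzma_props_ok_for_xz(uint8_t prop)
{
    if (prop > (4 * 5 + 4) * 9 + 8)
        return false;               /* not a valid Properties byte */

    uint8_t lc = prop % 9;          /* literal context bits */
    uint8_t lp = (prop / 9) % 5;    /* literal position bits */

    return lc + lp <= 4;
}
```

With the common 3/0/2 setting (Properties byte 93) the check passes, whereas lc = 8 with lp = 0 is valid for the format but fails the check.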
1.1.2. Dictionary Size

Dictionary Size is stored as an unsigned 32-bit little endian
integer. Any 32-bit value is possible, but for maximum
portability, only sizes of 2^n and 2^n + 2^(n-1) should be
used.

LZMA Utils creates only files with dictionary size 2^n,
16 <= n <= 25. LZMA Utils can decompress files with any
dictionary size.

XZ Utils creates and decompresses .lzma files only with
dictionary sizes 2^n and 2^n + 2^(n-1). If some other
dictionary size is specified when compressing, the value
stored in the Dictionary Size field is rounded up, but the
specified value is still used in the actual compression code.
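The two portable size classes and the rounding up can be illustrated with a simple loop. This is a sketch under stated assumptions and not the code used by XZ Utils: the function names are hypothetical, 2^n + 2^(n-1) is only considered for n >= 1, and returning UINT32_MAX when no such size fits in 32 bits is a choice made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if size is 2^n or 2^n + 2^(n-1) for some n. */
bool dict_size_is_portable(uint32_t size)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = UINT32_C(1) << n;

        if (size == pow2)
            return true;
        if (n > 0 && size == pow2 + (pow2 >> 1))
            return true;
    }
    return false;
}

/*
 * Returns the smallest value of the form 2^n or 2^n + 2^(n-1) that is
 * greater than or equal to size. A size of 0 yields 1. If no such
 * value fits in 32 bits, UINT32_MAX is returned (an assumption of
 * this sketch).
 */
uint32_t dict_size_round_up(uint32_t size)
{
    for (unsigned n = 0; n < 32; ++n) {
        uint32_t pow2 = UINT32_C(1) << n;

        if (size <= pow2)
            return pow2;
        if (n > 0 && size <= pow2 + (pow2 >> 1))
            return pow2 + (pow2 >> 1);
    }
    return UINT32_MAX;
}
```

The candidates are visited in ascending order (1, 2, 3, 4, 6, 8, 12, ...), so the first candidate that is large enough is also the smallest one.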
1.1.3. Uncompressed Size

Uncompressed Size is stored as an unsigned 64-bit little endian
integer. A special value of 0xFFFF_FFFF_FFFF_FFFF indicates
that Uncompressed Size is unknown. End of Payload Marker (*)
is used if and only if Uncompressed Size is unknown.
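Putting the three header fields together, the 13 header bytes can be split up as sketched below. The structure, the function names, and the constant are hypothetical and exist only for this example; the byte offsets follow the field order and sizes given in this document.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LZMA_ALONE_HEADER_SIZE 13

/* Hypothetical container for the decoded header fields. */
struct lzma_alone_header {
    uint8_t  properties;         /* raw Properties byte */
    uint32_t dict_size;          /* Dictionary Size */
    uint64_t uncompressed_size;  /* meaningful only if size_known */
    bool     size_known;         /* false if the field was all ones */
};

/* Read count bytes (count <= 8) as an unsigned little endian integer. */
static uint64_t read_le(const uint8_t *buf, size_t count)
{
    uint64_t value = 0;

    for (size_t i = 0; i < count; ++i)
        value |= (uint64_t)buf[i] << (8 * i);
    return value;
}

/* Split the header bytes into fields. Returns false on short input. */
bool lzma_alone_header_parse(const uint8_t *buf, size_t len,
                             struct lzma_alone_header *out)
{
    if (len < LZMA_ALONE_HEADER_SIZE)
        return false;

    out->properties = buf[0];
    out->dict_size = (uint32_t)read_le(buf + 1, 4);
    out->uncompressed_size = read_le(buf + 5, 8);
    out->size_known = out->uncompressed_size != UINT64_MAX;
    return true;
}
```

The sketch only separates the fields. Validating the Properties byte and deciding which dictionary sizes to accept are left to the caller.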
XZ Utils rejects files whose Uncompressed Size field specifies
a known size that is 256 GiB or more. This is to reject false
positives when trying to guess if the input file is in the
.lzma format. When Uncompressed Size is unknown, there is no
limit for the uncompressed size of the file.
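A format-detection heuristic of this kind could look as follows. This is a sketch of the rule as described here, not code from XZ Utils; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define LZMA_SIZE_UNKNOWN        UINT64_MAX
#define LZMA_KNOWN_SIZE_LIMIT    (UINT64_C(256) << 30)   /* 256 GiB */

/*
 * Hypothetical check: returns true if the Uncompressed Size field
 * would pass the limit described above.
 */
bool lzma_size_is_plausible(uint64_t uncompressed_size)
{
    if (uncompressed_size == LZMA_SIZE_UNKNOWN)
        return true;    /* unknown size: no limit applies */

    return uncompressed_size < LZMA_KNOWN_SIZE_LIMIT;
}
```

Since 256 GiB is 2^38 bytes, any known size with a bit set at position 38 or higher is rejected by this check.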
(*) Some tools use the term End of Stream (EOS) marker
    instead of End of Payload Marker.
1.2. LZMA Compressed Data

A detailed description of the format of this field is out of
the scope of this document.
LZMA SDK - The original LZMA implementation
http://7-zip.org/sdk.html

LZMA Utils - LZMA adapted to POSIX-like systems
http://tukaani.org/lzma/

XZ Utils - The next generation of LZMA Utils
http://tukaani.org/xz/

The .xz file format - The successor of the .lzma format
http://tukaani.org/xz/xz-file-format.txt