Deark is a command-line utility that can decode certain types of files, and
either:

1. convert them to a more-modern or more-readable format; or
2. extract embedded files from them

The files it writes are usually named "output.*".

This program is still being developed, and its features are subject to change.

For additional information, see the [technical.md](technical.md) file.

deark [options] [-file] <input-file> [options]
deark <-h|-version|-modules>

The "module" to use to process the input file. The default is to autodetect.
A module may represent one file format, or a group of related formats, or
may have some special purpose.
See formats.txt for a list of modules. You usually don't need to use -m,
unless the format can't be detected, or you want to use a special-purpose
module such as "copy". See also the -onlydetect option.

Don't extract, but list the files that would be extracted.
This option is not necessarily very efficient. Deark will still go through
all the motions of extracting the files, but will not actually write them.

Extract only "primary" files (e.g. not thumbnail images).

Extract only "auxiliary" files, such as thumbnail images.

Extract more data than usual, including things that are rarely of interest,
such as comments. See also the "-opt extract..." options.
Note that, as a general rule, Deark doesn't extract the same data twice.
In rare cases, the -a option can actually *prevent* it from extracting
certain data, because it may now, for example, extract a block of Exif
data, instead of drilling down and extracting the thumbnail image within
it.

Output filenames begin with this string. This can include a directory
path. Default is "output", except in some cases when using -zip/-tar.

Use exactly this filename for the first (and presumably only) output file.
The "-maxfiles 1" option is enabled automatically. Including the -main
option is recommended.

"Keep" the input filename, and use it as the initial part of the output
filename(s). Incompatible with -o.
-k: Use only the base filename.
-k2: Use the full path, but not as an actual path.
-k3: Use the full path, as-is.

The directory in which to write output files. The directory must exist.
This affects only files that Deark writes directly, not e.g. the names of
ZIP member files when using -zip.

Do not overwrite existing output files.

This is an alternate syntax for specifying the primary input file. It works
even if the filename begins with "-".

Some formats are composed of more than one file. In some cases, you can
use the -file2 option to specify the secondary file. Refer to the
formats.txt file for details.

Write output files to a .zip file, instead of to individual files.
If the input format is an "archive" format (e.g. "ar" or "zoo"), then
by default, the filenames in the ZIP archive might not include the usual

Write output files to a .tar file, instead of to individual files.
Similar to -zip, but may work better with large files.
The -tostdout option is not currently supported when using -tar.

-ta <filename> (alias: -arcfn)
When using -zip/-tar, use this name for the output file. Default is
"output.zip" or "output.tar".

When using -zip/-tar, "keep" the input filename, and use it as the initial
part of the archive output filename. A suitable filename extension like
".zip" will be appended. Incompatible with -arcfn.
-ka: Use only the base filename.
-ka2: Use the full path, but not as an actual path.
-ka3: Use the full path, as-is.

-extrlist <filename>
Also create a text file containing a list of the names of the extracted
files. Format is UTF-8, no BOM, LF terminators. To append to the file
instead of overwriting, use with "-opt extrlist:append".
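
As a rough illustration of consuming that list from a script, the following Python sketch reads a file in the documented format (UTF-8, no BOM, LF terminators). The function name `read_extracted_list` is hypothetical, and nothing here is part of Deark itself.

```python
from pathlib import Path


def read_extracted_list(list_path):
    """Return the names stored in an -extrlist file, in file order."""
    # Documented format: UTF-8, no BOM, LF line terminators.
    text = Path(list_path).read_bytes().decode("utf-8")
    # Split on LF only, and drop the empty piece after the final terminator.
    return [name for name in text.split("\n") if name]
```

For example, if a run used `-extrlist list.txt` (a made-up file name), then `read_extracted_list("list.txt")` would return the recorded names as a Python list.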

Write the output file(s) to the standard output stream (stdout).
It is recommended to put -tostdout early on the command line. The
-msgstostderr option is enabled automatically.
If used with -zip: Write the ZIP file to standard output.
Otherwise: The "-maxfiles 1" option is enabled automatically. Including the
-main option is recommended.

Read the input file from the standard input stream (stdin).
If you use -fromstdin, supplying an input filename is optional. If it is
supplied, the file will not be read (and need not exist), but the name
might be used to help guess the file format.
This option might not be very efficient, and might not work with extremely
large files.

Pretend that the input file starts at byte offset <n>.

Pretend that the input file contains only (up to) <n> bytes.

Don't extract the first <n> files found.

Extract at most <n> files. The normal default is 1000, or effectively
unlimited if using -zip.

Extract only the file identified by <n>. The first file is 0.
Equivalent to "-firstfile <n> -maxfiles 1".
To unconditionally show the file identifiers, use "-l -opt list:fileid".
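
The relationship between these selection options can be pictured with list slicing. This is only a Python sketch of the documented behavior, not Deark's implementation. The names `select_files` and `only_file` are hypothetical, and the sketch assumes files are numbered from 0 in the order they are found.

```python
def select_files(found, firstfile=0, maxfiles=1000):
    """Skip the first `firstfile` items, then keep at most `maxfiles`."""
    # 1000 mirrors the documented normal default for -maxfiles.
    return found[firstfile:firstfile + maxfiles]


def only_file(found, n):
    """Keep just the file identified by n, like "-firstfile <n> -maxfiles 1"."""
    return select_files(found, firstfile=n, maxfiles=1)
```

With four files found, `only_file(found, 2)` keeps just the third one, because the first file is number 0.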

Do not write a file larger than <n> bytes. The default is 10 GiB.
This is an "emergency brake". If the limit is exceeded, Deark will stop all
processing.
This setting is for physical output files, so if you use -zip/-tar, it
applies to the ZIP/tar file, not to the individual member files.
This option implicitly increases the -maxtotalsize setting to be at least
<n>.

Do not write files totaling more than about <n> bytes. The default is
Currently, this feature is not implemented very precisely. The limit is only
checked when an output file is completed.

Allow image dimensions up to <n> pixels.
By default, Deark refuses to generate images with a dimension larger than
10000 pixels. You can use -maxdim to decrease or increase the limit.
Increase the limit at your own risk. Deark does not generate large images
efficiently. In practice, a large dimension will only work if the other
dimension is very small.

Include "padding" pixels/bits in the image output.
Some images have extra bits at the end of each row that are used for
alignment, and are not normally made visible.
This option is not implemented for all formats.

Do not add a BOM to UTF-8 output files generated or converted by Deark. Note
that if a BOM already exists in the source data, it will not necessarily be
removed.

Do not try to record the original aspect ratio and pixel density in output

When generating an HTML document, use ASCII encoding instead of UTF-8. This
does not change how a browser will render the file; it just makes it larger
and very slightly more portable.

Make Deark less likely to try to improve output filenames by using names
from the contents of the input file. The output filenames will be more
predictable, but less informative.

In some cases, mainly when reading archive formats, a last-modified
timestamp contained in an input file will be used to set the timestamp of an
output file written directly to your computer (or with -zip/-tar, of a
member file inside that file). Use -nomodtime to disable this.
This does not affect internal timestamps that may be maintained when Deark
converts an item to some other format (such as PNG or HTML).

-opt <module:option>=<value>
Module-specific and feature-specific options. See formats.txt.
Caution: Unrecognized or misspelled options will be silently ignored.
Options not specific to one format:

-opt font:charsperrow=<n>
The number of characters per row, when rendering a font to a bitmap.

-opt font:tounicode=<0|1>
[Don't] Try to translate a font's codepoints to Unicode codepoints.

-opt char:output=<html|image>
The output format for character graphics (such as ANSI Art).

-opt char:charwidth=<8|9>
The VGA character cell width for character graphics, when the output

-opt archive:subdirs=0
When using -zip/-tar, disallow subdirectories (the "/" character) in

-opt archive:zipcmprlevel=<n>
When using -zip, the compression level to use, from 0 (none) to 9 (max).

-opt pngcmprlevel=<n>
When generating a PNG file, the compression level to use, from 0 (low)

-opt archive:timestamp=<n>
Make the -zip/-tar output reproducible, by not including modification
times that are not contained in the source file. (That is, don't use the
current time, or the source file's timestamp.) If you use "repro", the
times will be set to some arbitrary value. If you use "timestamp", the
times will be set to the value you supply, in Unix time format (the
number of seconds since the beginning of 1970).

-opt keepdirentries=<0|1>
Select whether an archive file's directory entries are ignored (0), or
"extracted" (1). For details, see the technical.md file.

-opt list:fileid=<0|1>
Select whether the -l (list) option also prints the numeric file
identifiers.

Affects the -extrlist option.

Extract the specified type of data to a file, instead of decoding it.
For more about the ".8bimtiff" and ".iptctiff" formats, see the

-opt atari:palbits=<9|12|15>
For some Atari image formats, the number of significant bits per
palette color. The default is to autodetect.

-opt macrsrc=<raw|as|ad|mbin>
The preferred way to extract Macintosh resource forks, and data files
associated with a non-empty resource fork.
raw = Write the raw resource fork to a separate .rsrc file.
ad = Put the resource fork in an AppleDouble container (default).
as = Put both forks in an AppleSingle container.
mbin = Put both forks in a MacBinary container.
For input files already in AppleDouble or AppleSingle format, see the
formats.txt file for more information.

-opt deflatecodec=native
Use Deark's native "Deflate" decompressor when possible, instead of
miniz. It is experimental and much slower, but could be useful for
debugging and educational purposes.
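
The value given to the archive:timestamp option described above is an ordinary Unix time. A minimal Python sketch for computing one from a UTC calendar date follows. The helper names are made up for illustration, and the returned list is simply the two command-line arguments in the documented `-opt archive:timestamp=<n>` form.

```python
from datetime import datetime, timezone


def unix_time(year, month, day, hour=0, minute=0, second=0):
    """Seconds since the beginning of 1970, for a moment given in UTC."""
    moment = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
    return int(moment.timestamp())


def timestamp_option(seconds):
    """Build the argument pair for a fixed archive timestamp."""
    return ["-opt", f"archive:timestamp={seconds}"]
```

For instance, `timestamp_option(unix_time(2020, 1, 1))` yields `["-opt", "archive:timestamp=1577836800"]`, which could be added to a command line that also uses -zip or -tar.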

Stop after the format identification phase. This can be used to show what
module Deark will run, without actually running it.

Print the help message.
Use with -m to get help for a specific module. Use with a filename to get
help for the detected format of that file. Note that most modules have no
module-specific help to speak of.

Print the version number, and other version information.

Print the names of the available modules.
With -a, list all modules, including internal modules, and modules that

Suppress informational messages.

Suppress warning messages.

Suppress informational and warning messages.

Print technical and debugging information. -d2 and -d3 are more verbose.

Start each line printed by -d with this prefix. Default is "DEBUG: ".

-colormode <none|auto|ansi|ansi24|winconsole>
Control whether Deark uses color and similar features in its debug output.
Currently, this is mainly used to highlight unprintable characters, and
preview color palettes (usually requires -d2).
none: No color (default).
ansi: Use ANSI codes, but not the less-standard ones for 24-bit color.
ansi24: Use ANSI codes, including codes for 24-bit color. Works on most
Linux terminals, and on sufficiently new versions of Windows 10.
winconsole: Use Windows console commands. Works on all versions of Windows,
but does not support 24-bit color.
auto: Request color. Let Deark decide how to do it.

Same as "-colormode auto".

Set the encoding of the messages that are printed to the console. This does
not affect the extracted data files.
The default is to use Unicode (UTF-8, when the encoding is relevant).
ascii: Use ASCII characters only.
oem: [Windows only; has no effect on other platforms] Use the "OEM"
character set. This may be useful when paging the output with "|more".

[Windows only] Never change the console OEM code page (to UTF-8).
For technical reasons, Deark sometimes changes the code page of the Windows
console it is running in, when its output is going to a pipe or file.

-inenc <ascii|utf8|latin1|latin2|cp437|windows874|windows1250|windows1251|
windows1252|windows1253|windows1254|macroman|palm|riscos|atarist>
Supply a hint as to the encoding of the text contained in the input file.
This option is not supported by all formats, and may be ignored if the
encoding can be reliably determined by other means. Admittedly, it would be
nice if Deark knew more encodings than this.

Supply a hint as to the time zone used by timestamps contained in the input
file.
Many file formats unfortunately contain timestamps in "local time", with no
information about their time zone. In such cases, the supplied -intz offset
will be used to convert the timestamp to UTC.
The "offset" parameter is in hours east of UTC. For example, New York City
is -5.0, or -4.0 when Daylight Saving Time is in effect.
This option does not respect Daylight Saving Time. It cannot deal with the
case where some of the timestamps in a file are in DST, and others are not.
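
To make the offset convention concrete, here is a small Python sketch of the conversion described above: a timestamp with no zone information is treated as being `offset` hours east of UTC, and is then converted to UTC. The function name `local_to_utc` is hypothetical, and this is not Deark's code; it only mirrors the arithmetic.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_time, offset_hours_east):
    """Interpret a naive local timestamp using a fixed offset, return UTC."""
    # Hours east of UTC are positive, so New York in winter is -5.0.
    zone = timezone(timedelta(hours=offset_hours_east))
    return local_time.replace(tzinfo=zone).astimezone(timezone.utc)
```

With an offset of -5.0, a local time of 12:00 becomes 17:00 UTC. Because one fixed offset is applied to every timestamp, a file that mixes DST and non-DST times cannot be converted correctly, which matches the limitation noted above.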

Print all messages to stderr, instead of stdout. This option should be
placed early on the command line, as it might not affect messages
related to options that appear before it.

-nodetect <module1,module2,...>
-onlydetect <module1,module2,...>
Disable autodetection of the formats in the list (or for -onlydetect, the
formats *not* in the list).
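
The two options are complements of each other, which a short Python sketch can show. It models only the documented effect on which formats remain eligible for autodetection, not how Deark implements it. The function names are hypothetical, and the module names in the usage note are placeholders; see formats.txt for the real ones.

```python
def apply_nodetect(modules, listed):
    """Formats still eligible for autodetection after -nodetect."""
    blocked = set(listed)
    return [m for m in modules if m not in blocked]


def apply_onlydetect(modules, listed):
    """Formats still eligible for autodetection after -onlydetect."""
    allowed = set(listed)
    return [m for m in modules if m in allowed]
```

Given candidates `["ar", "zoo", "tar", "gzip"]`, listing `tar,gzip` leaves `ar` and `zoo` under the first function, and leaves `tar` and `gzip` under the second.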

-disablemods <module1,module2,...>
-onlymods <module1,module2,...>
Completely disable the main functionality, and the autodetection
functionality, of the modules in the list (or for -onlymods, the modules
*not* in the list). This can have unexpected side effects, because modules
often use other modules internally. These options exist mainly to help
address potential security-related concerns in some workflows.

Run the module in a non-default "mode".
The existence of this option (though not its details) is documented in the
interest of transparency, but it is mainly for developers, and to make it
possible to do things whose usefulness was not anticipated.

Deark sets the exit status to nonzero only if it wasn't able to do its job,
e.g. due to a read or write failure. A malformed input file usually does not
cause such an error, and the exit status will be zero even if an error message
is printed.

However, all fatal errors result in a nonzero exit status, and in extreme cases
it is possible for the input file to cause a fatal error, due to certain
resource limits being exceeded.
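
A script that calls Deark therefore should not treat a zero exit status as proof that the input was decoded cleanly. The Python sketch below shows one way to capture both the status and the printed messages. The function name `run_tool` is hypothetical, and the sketch makes no assumption about the wording of Deark's messages.

```python
import subprocess


def run_tool(argv):
    """Run a command; return (status_was_zero, stdout_text, stderr_text)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    # A zero status only means no fatal error occurred. Messages about a
    # malformed input may still have been printed, so return the text too.
    return result.returncode == 0, result.stdout, result.stderr
```

A call such as `run_tool(["deark", "-l", "input.bin"])` (where `input.bin` is a placeholder) returns the status together with the output, so the caller can decide how to treat any messages it finds there.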

Starting with version 1.4.x, Deark is distributed under an MIT-style license.
See the [COPYING](COPYING) file for the license text.

The main Deark license does not necessarily apply to the code in the "foreign"
subdirectory. Each file there may have its own licensing terms. In particular:

uncompface.h: Copyright (c) James Ashton - Sydney University - June 1990
(See the file foreign/readme-compface.txt for details.)

By necessity, Deark contains knowledge about how to decode various
third-party file formats. This knowledge includes data structures,
algorithms, tables, color palettes, etc. The author(s) of Deark make no
intellectual property claims to this essential knowledge, but they cannot
guarantee that no one else will attempt to do so.

Deark contains VGA and CGA bitmapped fonts, which have no known copyright

Be particularly wary of relying on Deark to decode archive and compression
formats (tar, ar, gzip, cpio, ...). For example, to decode tar format, you
really should use a battle-hardened application like GNU Tar, not Deark.
Deark's support for such formats is often incomplete, and it does not always
do integrity checking.

## Feedback and contributions ##

(As of 2020-09.) Suggestions and bug reports are welcome. This can be done by
opening a GitHub issue, or by email. If you prefer to do it in the form of a
GitHub "pull request", that's fine too, but as a general rule, such requests
won't be merged directly.

Deark is not really a collaborative project at this time. Unsolicited
contributions of more than a few lines of code are unlikely to be accepted.
It's okay to offer them, but please don't do a lot of work with the
expectation that it will be accepted.

Any code copyrighted by someone other than the main Deark developer(s) is only
allowed in the "foreign" section of the project. Pointers to existing open
source format decoders, that might be useful in Deark, are welcome. However,
most such code will be rejected for one reason or another (incompatible
license, too large, too trivial, etc.).

See the [technical.md](technical.md) file.

## Acknowledgements ##

Thanks to Rich Geldreich for the miniz library.

Thanks to James Ashton for much of the code used by the X-Face format decoder.

Thanks to countless others who have documented the supported file formats.

Written by Jason Summers, 2014-2021.<br>
Copyright © 2016-2021 Jason Summers<br>
[https://entropymine.com/deark/](https://entropymine.com/deark/)<br>
[https://github.com/jsummers/deark](https://github.com/jsummers/deark)