3 @setfilename coreutils.info
4 @settitle @sc{gnu} Coreutils
9 @include constants.texi
11 @c Define new indices.
15 @c Put everything in one index (arbitrarily chosen to be the concept index).
25 * Coreutils: (coreutils). Core GNU utilities.
26 * Common options: (coreutils)Common options. Common options.
27 * File permissions: (coreutils)File permissions. Access modes.
28 * Date input formats: (coreutils)Date input formats.
31 @dircategory Individual utilities
33 * basename: (coreutils)basename invocation. Strip directory and suffix.
34 * cat: (coreutils)cat invocation. Concatenate and write files.
35 * chgrp: (coreutils)chgrp invocation. Change file groups.
36 * chmod: (coreutils)chmod invocation. Change file permissions.
37 * chown: (coreutils)chown invocation. Change file owners/groups.
38 * chroot: (coreutils)chroot invocation. Specify the root directory.
39 * cksum: (coreutils)cksum invocation. Print CRC checksum.
40 * comm: (coreutils)comm invocation. Compare sorted files by line.
41 * cp: (coreutils)cp invocation. Copy files.
42 * csplit: (coreutils)csplit invocation. Split by context.
43 * cut: (coreutils)cut invocation. Print selected parts of lines.
44 * date: (coreutils)date invocation. Print/set system date and time.
45 * dd: (coreutils)dd invocation. Copy and convert a file.
46 * df: (coreutils)df invocation. Report filesystem disk usage.
47 * dir: (coreutils)dir invocation. List directories briefly.
48 * dircolors: (coreutils)dircolors invocation. Color setup for ls.
49 * dirname: (coreutils)dirname invocation. Strip non-directory suffix.
50 * du: (coreutils)du invocation. Report on disk usage.
51 * echo: (coreutils)echo invocation. Print a line of text.
52 * env: (coreutils)env invocation. Modify the environment.
53 * expand: (coreutils)expand invocation. Convert tabs to spaces.
54 * expr: (coreutils)expr invocation. Evaluate expressions.
* factor: (coreutils)factor invocation. Print prime factors.
56 * false: (coreutils)false invocation. Do nothing, unsuccessfully.
57 * fmt: (coreutils)fmt invocation. Reformat paragraph text.
58 * fold: (coreutils)fold invocation. Wrap long input lines.
59 * groups: (coreutils)groups invocation. Print group names a user is in.
60 * head: (coreutils)head invocation. Output the first part of files.
61 * hostid: (coreutils)hostid invocation. Print numeric host identifier.
62 * hostname: (coreutils)hostname invocation. Print or set system name.
63 * id: (coreutils)id invocation. Print real/effective uid/gid.
64 * install: (coreutils)install invocation. Copy and change attributes.
65 * join: (coreutils)join invocation. Join lines on a common field.
66 * kill: (coreutils)kill invocation. Send a signal to processes.
67 * link: (coreutils)link invocation. Make hard links between files.
68 * ln: (coreutils)ln invocation. Make links between files.
69 * logname: (coreutils)logname invocation. Print current login name.
70 * ls: (coreutils)ls invocation. List directory contents.
71 * md5sum: (coreutils)md5sum invocation. Print or check message-digests.
72 * mkdir: (coreutils)mkdir invocation. Create directories.
73 * mkfifo: (coreutils)mkfifo invocation. Create FIFOs (named pipes).
74 * mknod: (coreutils)mknod invocation. Create special files.
75 * mv: (coreutils)mv invocation. Rename files.
76 * nice: (coreutils)nice invocation. Modify scheduling priority.
77 * nl: (coreutils)nl invocation. Number lines and write files.
78 * nohup: (coreutils)nohup invocation. Immunize to hangups.
79 * od: (coreutils)od invocation. Dump files in octal, etc.
80 * paste: (coreutils)paste invocation. Merge lines of files.
81 * pathchk: (coreutils)pathchk invocation. Check file name portability.
82 * pr: (coreutils)pr invocation. Paginate or columnate files.
83 * printenv: (coreutils)printenv invocation. Print environment variables.
84 * printf: (coreutils)printf invocation. Format and print data.
85 * ptx: (coreutils)ptx invocation. Produce permuted indexes.
86 * pwd: (coreutils)pwd invocation. Print working directory.
87 * readlink: (coreutils)readlink invocation. Print referent of a symlink.
88 * rm: (coreutils)rm invocation. Remove files.
89 * rmdir: (coreutils)rmdir invocation. Remove empty directories.
* seq: (coreutils)seq invocation. Print numeric sequences.
91 * shred: (coreutils)shred invocation. Remove files more securely.
92 * sleep: (coreutils)sleep invocation. Delay for a specified time.
93 * sort: (coreutils)sort invocation. Sort text files.
94 * split: (coreutils)split invocation. Split into fixed-size pieces.
95 * stat: (coreutils)stat invocation. Report file(system) status.
96 * stty: (coreutils)stty invocation. Print/change terminal settings.
97 * su: (coreutils)su invocation. Modify user and group id.
98 * sum: (coreutils)sum invocation. Print traditional checksum.
99 * sync: (coreutils)sync invocation. Synchronize memory and disk.
100 * tac: (coreutils)tac invocation. Reverse files.
101 * tail: (coreutils)tail invocation. Output the last part of files.
102 * tee: (coreutils)tee invocation. Redirect to multiple files.
103 * test: (coreutils)test invocation. File/string tests.
104 * touch: (coreutils)touch invocation. Change file timestamps.
105 * tr: (coreutils)tr invocation. Translate characters.
106 * true: (coreutils)true invocation. Do nothing, successfully.
107 * tsort: (coreutils)tsort invocation. Topological sort.
108 * tty: (coreutils)tty invocation. Print terminal name.
109 * uname: (coreutils)uname invocation. Print system information.
110 * unexpand: (coreutils)unexpand invocation. Convert spaces to tabs.
111 * uniq: (coreutils)uniq invocation. Uniquify files.
112 * unlink: (coreutils)unlink invocation. Removal via unlink(2).
113 * users: (coreutils)users invocation. Print current user names.
114 * vdir: (coreutils)vdir invocation. List directories verbosely.
115 * wc: (coreutils)wc invocation. Byte, word, and line counts.
116 * who: (coreutils)who invocation. Print who is logged in.
117 * whoami: (coreutils)whoami invocation. Print effective user id.
118 * yes: (coreutils)yes invocation. Print a string indefinitely.
122 This manual documents version @value{VERSION} of the @sc{gnu} core
123 utilities, including the standard programs for text and file manipulation.
125 Copyright @copyright{} 1994, 1995, 1996, 2000, 2001, 2002, 2003
126 Free Software Foundation, Inc.
129 Permission is granted to copy, distribute and/or modify this document
130 under the terms of the GNU Free Documentation License, Version 1.1 or
131 any later version published by the Free Software Foundation; with no
132 Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
133 Texts. A copy of the license is included in the section entitled ``GNU
134 Free Documentation License''.
139 @title @sc{gnu} @code{Coreutils}
140 @subtitle Core GNU utilities
141 @subtitle for version @value{VERSION}, @value{UPDATED}
142 @author David MacKenzie et al.
145 @vskip 0pt plus 1filll
157 @cindex core utilities
158 @cindex text utilities
159 @cindex shell utilities
160 @cindex file utilities
163 * Introduction:: Caveats, overview, and authors.
164 * Common options:: Common options.
165 * Output of entire files:: cat tac nl od
166 * Formatting file contents:: fmt pr fold
167 * Output of parts of files:: head tail split csplit
168 * Summarizing files:: wc sum cksum md5sum
169 * Operating on sorted files:: sort uniq comm ptx tsort
170 * Operating on fields within a line:: cut paste join
171 * Operating on characters:: tr expand unexpand
172 * Directory listing:: ls dir vdir d v dircolors
173 * Basic operations:: cp dd install mv rm shred
174 * Special file types:: ln mkdir rmdir mkfifo mknod
175 * Changing file attributes:: chgrp chmod chown touch
176 * Disk usage:: df du stat sync
177 * Printing text:: echo printf yes
178 * Conditions:: false true test expr
180 * File name manipulation:: dirname basename pathchk
181 * Working context:: pwd stty printenv tty
182 * User information:: id logname whoami groups users who
183 * System context:: date uname hostname
184 * Modified command invocation:: chroot env nice nohup su
185 * Process control:: kill
187 * Numeric operations:: factor seq
188 * File permissions:: Access modes.
189 * Date input formats:: Specifying date strings.
190 * Opening the software toolbox:: The software tools philosophy.
191 * GNU Free Documentation License:: The license for this documentation.
192 * Index:: General index.
195 --- The Detailed Node Listing ---
199 * Backup options:: Backup options
200 * Block size:: Block size
201 * Target directory:: Target directory
202 * Trailing slashes:: Trailing slashes
203 * Standards conformance:: Standards conformance
205 Output of entire files
207 * cat invocation:: Concatenate and write files.
208 * tac invocation:: Concatenate and write files in reverse.
209 * nl invocation:: Number lines and write files.
210 * od invocation:: Write files in octal or other formats.
212 Formatting file contents
214 * fmt invocation:: Reformat paragraph text.
215 * pr invocation:: Paginate or columnate files for printing.
216 * fold invocation:: Wrap input lines to fit in specified width.
218 Output of parts of files
220 * head invocation:: Output the first part of files.
221 * tail invocation:: Output the last part of files.
222 * split invocation:: Split a file into fixed-size pieces.
223 * csplit invocation:: Split a file into context-determined pieces.
227 * wc invocation:: Print byte, word, and line counts.
228 * sum invocation:: Print checksum and block counts.
229 * cksum invocation:: Print CRC checksum and byte counts.
230 * md5sum invocation:: Print or check message-digests.
232 Operating on sorted files
234 * sort invocation:: Sort text files.
235 * uniq invocation:: Uniquify files.
236 * comm invocation:: Compare two sorted files line by line.
237 * ptx invocation:: Produce a permuted index of file contents.
238 * tsort invocation:: Topological sort.
240 @command{ptx}: Produce permuted indexes
242 * General options in ptx:: Options which affect general program behavior.
243 * Charset selection in ptx:: Underlying character set considerations.
244 * Input processing in ptx:: Input fields, contexts, and keyword selection.
245 * Output formatting in ptx:: Types of output format, and sizing the fields.
246 * Compatibility in ptx:: The GNU extensions to @command{ptx}
248 Operating on fields within a line
250 * cut invocation:: Print selected parts of lines.
251 * paste invocation:: Merge lines of files.
252 * join invocation:: Join lines on a common field.
254 Operating on characters
256 * tr invocation:: Translate, squeeze, and/or delete characters.
257 * expand invocation:: Convert tabs to spaces.
258 * unexpand invocation:: Convert spaces to tabs.
260 @command{tr}: Translate, squeeze, and/or delete characters
262 * Character sets:: Specifying sets of characters.
* Translating:: Changing one character to another.
264 * Squeezing:: Squeezing repeats and deleting.
265 * Warnings in tr:: Warning messages.
269 * ls invocation:: List directory contents
270 * dir invocation:: Briefly list directory contents
271 * vdir invocation:: Verbosely list directory contents
272 * dircolors invocation:: Color setup for @command{ls}
274 @command{ls}: List directory contents
276 * Which files are listed:: Which files are listed
277 * What information is listed:: What information is listed
278 * Sorting the output:: Sorting the output
279 * More details about version sort:: More details about version sort
280 * General output formatting:: General output formatting
281 * Formatting the file names:: Formatting the file names
285 * cp invocation:: Copy files and directories
286 * dd invocation:: Convert and copy a file
287 * install invocation:: Copy files and set attributes
288 * mv invocation:: Move (rename) files
289 * rm invocation:: Remove files or directories
290 * shred invocation:: Remove files more securely
294 * link invocation:: Make a hard link via the link syscall
295 * ln invocation:: Make links between files
296 * mkdir invocation:: Make directories
297 * mkfifo invocation:: Make FIFOs (named pipes)
298 * mknod invocation:: Make block or character special files
299 * readlink invocation:: Print the referent of a symbolic link
300 * rmdir invocation:: Remove empty directories
301 * unlink invocation:: Remove files via unlink syscall
303 Changing file attributes
305 * chown invocation:: Change file owner and group
306 * chgrp invocation:: Change group ownership
307 * chmod invocation:: Change access permissions
308 * touch invocation:: Change file timestamps
312 * df invocation:: Report filesystem disk space usage
313 * du invocation:: Estimate file space usage
314 * stat invocation:: Report file or filesystem status
315 * sync invocation:: Synchronize data on disk with memory
319 * echo invocation:: Print a line of text
320 * printf invocation:: Format and print data
321 * yes invocation:: Print a string until interrupted
325 * false invocation:: Do nothing, unsuccessfully
326 * true invocation:: Do nothing, successfully
327 * test invocation:: Check file types and compare values
328 * expr invocation:: Evaluate expressions
330 @command{test}: Check file types and compare values
332 * File type tests:: File type tests
333 * Access permission tests:: Access permission tests
334 * File characteristic tests:: File characteristic tests
335 * String tests:: String tests
336 * Numeric tests:: Numeric tests
338 @command{expr}: Evaluate expression
340 * String expressions:: + : match substr index length
341 * Numeric expressions:: + - * / %
342 * Relations for expr:: | & < <= = == != >= >
343 * Examples of expr:: Examples of using @command{expr}
347 * tee invocation:: Redirect output to multiple files
349 File name manipulation
351 * basename invocation:: Strip directory and suffix from a file name
352 * dirname invocation:: Strip non-directory suffix from a file name
353 * pathchk invocation:: Check file name portability
357 * pwd invocation:: Print working directory
358 * stty invocation:: Print or change terminal characteristics
359 * printenv invocation:: Print all or some environment variables
360 * tty invocation:: Print file name of terminal on standard input
362 @command{stty}: Print or change terminal characteristics
364 * Control:: Control settings
365 * Input:: Input settings
366 * Output:: Output settings
367 * Local:: Local settings
368 * Combination:: Combination settings
369 * Characters:: Special characters
370 * Special:: Special settings
374 * id invocation:: Print real and effective uid and gid
375 * logname invocation:: Print current login name
376 * whoami invocation:: Print effective user id
377 * groups invocation:: Print group names a user is in
378 * users invocation:: Print login names of users currently logged in
379 * who invocation:: Print who is currently logged in
383 * date invocation:: Print or set system date and time
384 * uname invocation:: Print system information
385 * hostname invocation:: Print or set system name
386 * hostid invocation:: Print numeric host identifier.
388 @command{date}: Print or set system date and time
390 * Time directives:: Time directives
391 * Date directives:: Date directives
392 * Literal directives:: Literal directives
394 * Setting the time:: Setting the time
395 * Options for date:: Options for @command{date}
396 * Examples of date:: Examples of @command{date}
398 Modified command invocation
400 * chroot invocation:: Run a command with a different root directory
401 * env invocation:: Run a command in a modified environment
402 * nice invocation:: Run a command with modified scheduling priority
403 * nohup invocation:: Run a command immune to hangups
404 * su invocation:: Run a command with substitute user and group id
408 * kill invocation:: Sending a signal to processes.
412 * sleep invocation:: Delay for a specified time
416 * factor invocation:: Print prime factors
417 * seq invocation:: Print numeric sequences
421 * Mode Structure:: Structure of File Permissions
422 * Symbolic Modes:: Mnemonic permissions representation
423 * Numeric Modes:: Permissions as octal numbers
427 * General date syntax: General date syntax
428 * Calendar date items: Calendar date items
429 * Time of day items: Time of day items
430 * Time zone items: Time zone items
431 * Day of week items: Day of week items
432 * Relative items in date strings: Relative items in date strings
433 * Pure numbers in date strings: Pure numbers in date strings
434 * Authors of getdate: Authors of getdate
436 Opening the software toolbox
438 * Toolbox introduction:: Toolbox introduction
439 * I/O redirection:: I/O redirection
440 * The who command:: The @command{who} command
441 * The cut command:: The @command{cut} command
442 * The sort command:: The @command{sort} command
443 * The uniq command:: The @command{uniq} command
444 * Putting the tools together:: Putting the tools together
446 GNU Free Documentation License
448 * How to use this License for your documents::
455 @chapter Introduction
457 This manual is a work in progress: many sections make no attempt to explain
458 basic concepts in a way suitable for novices. Thus, if you are interested,
459 please get involved in improving this manual. The entire @sc{gnu} community
462 @cindex @acronym{POSIX}
463 The @sc{gnu} utilities documented here are mostly compatible with the
464 @acronym{POSIX} standard.
465 @cindex bugs, reporting
466 Please report bugs to @email{bug-coreutils@@gnu.org}. Remember
467 to include the version number, machine architecture, input files, and
468 any other information needed to reproduce the bug: your input, what you
469 expected, what you got, and why it is wrong. Diffs are welcome, but
470 please include a description of the problem as well, since this is
471 sometimes difficult to infer. @xref{Bugs, , , gcc, Using and Porting GNU CC}.
477 @cindex MacKenzie, D.
480 This manual was originally derived from the Unix man pages in the
481 distributions, which were written by David MacKenzie and updated by Jim
482 Meyering. What you are reading now is the authoritative documentation
483 for these utilities; the man pages are no longer being maintained. The
484 original @command{fmt} man page was written by Ross Paterson. Fran@,{c}ois
485 Pinard did the initial conversion to Texinfo format. Karl Berry did the
486 indexing, some reorganization, and editing of the results. Brian
487 Youmans of the Free Software Foundation office staff combined the
488 manuals for textutils, fileutils, and sh-utils to produce the present
489 omnibus manual. Richard Stallman contributed his usual invaluable
490 insights to the overall process.
493 @chapter Common options
495 @cindex common options
Certain options are available in all of these programs. Rather than
repeating an identical description for each program, this chapter
describes them once. (In fact, every @sc{gnu} program accepts (or should accept)
502 @vindex POSIXLY_CORRECT
503 Normally options and operands can appear in any order, and programs act
504 as if all the options appear before any operands. For example,
505 @samp{sort -r passwd -t :} acts like @samp{sort -r -t : passwd}, since
506 @samp{:} is an option-argument of @option{-t}. However, if the
507 @env{POSIXLY_CORRECT} environment variable is set, options must appear
508 before operands, unless otherwise specified for a particular command.
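
As a minimal sketch of the permutation rule above, the following
commands compare the two spellings of the example.  The file name
@file{passwd.sample} and its contents are invented for the
illustration, and the sketch assumes that @env{POSIXLY_CORRECT} is not
set.

```shell
# Hypothetical sample data; any colon-separated file would do.
tmp=$(mktemp -d)
printf 'root:0\nbin:1\ndaemon:2\n' > "$tmp/passwd.sample"

# Make sure option permutation is not disabled for this demonstration.
unset POSIXLY_CORRECT

# Options given after the operand ...
permuted=$(sort -r "$tmp/passwd.sample" -t :)
# ... act as if they had been given before it.
ordered=$(sort -r -t : "$tmp/passwd.sample")

if [ "$permuted" = "$ordered" ]; then
  echo "same output"
fi
rm -rf "$tmp"
```

Both invocations should produce identical output, since @samp{:} is
consumed as the option-argument of @option{-t} in either position.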
510 Some of these programs recognize the @option{--help} and @option{--version}
511 options only when one of them is the sole command line argument.
518 Print a usage message listing all available options, then exit successfully.
522 @cindex version number, finding
523 Print the version number, then exit successfully.
527 @cindex option delimiter
528 Delimit the option list. Later arguments, if any, are treated as
529 operands even if they begin with @samp{-}. For example, @samp{sort --
530 -r} reads from the file named @file{-r}.
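
The effect of @samp{--} can be checked with a short sketch such as the
one below.  The temporary directory and the file contents are
assumptions made only for the illustration.

```shell
# Create a file whose name looks like an option.
tmp=$(mktemp -d)
printf 'pear\napple\n' > "$tmp/-r"

# After '--', '-r' is an operand (a file name), not the reverse option.
sorted=$(cd "$tmp" && sort -- -r)
printf '%s\n' "$sorted"
rm -rf "$tmp"
```

Here @command{sort} reads the file @file{-r} and prints its lines in
ascending order; without @samp{--} it would instead sort standard
input in reverse.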
534 @cindex standard input
535 @cindex standard output
536 A single @samp{-} is not really an option, though it looks like one. It
537 stands for standard input, or for standard output if that is clear from
538 the context, and it can be used either as an operand or as an
539 option-argument. For example, @samp{sort -o - -} outputs to standard
540 output and reads from standard input, and is equivalent to plain
541 @samp{sort}. Unless otherwise specified, @samp{-} can appear in any
542 context that requires a file name.
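
A small sketch of @samp{-} used as an operand follows.  It uses
@command{cat} rather than @command{sort}, and the file names are
hypothetical.

```shell
tmp=$(mktemp -d)
printf 'first\n' > "$tmp/head.txt"
printf 'last\n' > "$tmp/tail.txt"

# The '-' operand makes cat read standard input between the two files.
joined=$(printf 'middle\n' | cat "$tmp/head.txt" - "$tmp/tail.txt")
printf '%s\n' "$joined"
rm -rf "$tmp"
```

The output consists of the three lines @samp{first}, @samp{middle},
and @samp{last}, in that order.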
545 * Backup options:: -b -S -V, in some programs.
546 * Block size:: BLOCK_SIZE and --block-size, in some programs.
547 * Target directory:: --target-directory, in some programs.
548 * Trailing slashes:: --strip-trailing-slashes, in some programs.
549 * Standards conformance:: Conformance to the @acronym{POSIX} standard.
554 @section Backup options
556 @cindex backup options
Some @sc{gnu} programs (at least @command{cp}, @command{install}, @command{ln}, and
@command{mv}) optionally make backups of files before writing new versions.
560 These options control the details of these backups. The options are also
561 briefly mentioned in the descriptions of the particular programs.
566 @itemx @w{@kbd{--backup}[=@var{method}]}
569 @vindex VERSION_CONTROL
570 @cindex backups, making
571 Make a backup of each file that would otherwise be overwritten or removed.
572 Without this option, the original versions are destroyed.
573 Use @var{method} to determine the type of backups to make.
If this option is given without @var{method}, the value of the
@env{VERSION_CONTROL} environment variable is used; if that variable
is not set, the default backup type is @samp{existing}.
Note that the short form of this option, @option{-b}, does not accept any
argument. Using @option{-b} is equivalent to using @option{--backup=existing}.
582 @vindex version-control @r{Emacs variable}
583 This option corresponds to the Emacs variable @samp{version-control};
584 the values for @var{method} are the same as those used in Emacs.
585 This option also accepts more descriptive names.
586 The valid @var{method}s are (unique abbreviations are accepted):
591 @opindex none @r{backup method}
596 @opindex numbered @r{backup method}
597 Always make numbered backups.
601 @opindex existing @r{backup method}
602 Make numbered backups of files that already have them, simple backups
607 @opindex simple @r{backup method}
608 Always make simple backups. Please note @samp{never} is not to be
609 confused with @samp{none}.
613 @item -S @var{suffix}
614 @itemx --suffix=@var{suffix}
617 @cindex backup suffix
618 @vindex SIMPLE_BACKUP_SUFFIX
619 Append @var{suffix} to each backup file made with @option{-b}. If this
620 option is not specified, the value of the @env{SIMPLE_BACKUP_SUFFIX}
621 environment variable is used. And if @env{SIMPLE_BACKUP_SUFFIX} is not
622 set, the default is @samp{~}, just as in Emacs.
624 @itemx --version-control=@var{method}
625 @opindex --version-control
626 @c FIXME: remove this block one or two releases after the actual
627 @c removal from the code.
628 This option is obsolete and will be removed in a future release.
629 It has been replaced with @w{@kbd{--backup}}.
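
The backup methods and the suffix option can be tried out with a
sketch like the following.  All file names are hypothetical, and the
two environment variables are unset first so that the defaults
described above apply.

```shell
unset VERSION_CONTROL SIMPLE_BACKUP_SUFFIX
tmp=$(mktemp -d)
printf 'v1\n' > "$tmp/dst"
printf 'v1\n' > "$tmp/other"
printf 'v2\n' > "$tmp/src"

# -b means --backup=existing: no numbered backup exists yet,
# so a simple backup named dst~ is made.
cp -b "$tmp/src" "$tmp/dst"
# Force a numbered backup: dst.~1~
cp --backup=numbered "$tmp/src" "$tmp/dst"
# Numbered backups now exist, so 'existing' continues them: dst.~2~
cp -b "$tmp/src" "$tmp/dst"
# Simple backup with a custom suffix: other.orig
cp -b -S .orig "$tmp/src" "$tmp/other"

backups=$(cd "$tmp" && ls)
printf '%s\n' "$backups"
rm -rf "$tmp"
```

The listing should include @file{dst~}, @file{dst.~1~},
@file{dst.~2~}, and @file{other.orig} alongside the original files.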
Some @sc{gnu} programs (at least @command{df}, @command{du}, and
@command{ls}) display sizes in ``blocks''. You can adjust the block size
640 and method of display to make sizes easier to read. The block size
641 used for display is independent of any filesystem block size.
642 Fractional block counts are rounded up to the nearest integer.
644 @opindex --block-size=@var{size}
646 @vindex DF_BLOCK_SIZE
647 @vindex DU_BLOCK_SIZE
648 @vindex LS_BLOCK_SIZE
649 @vindex POSIXLY_CORRECT@r{, and block size}
651 The default block size is chosen by examining the following environment
652 variables in turn; the first one that is set determines the block size.
657 This specifies the default block size for the @command{df} command.
658 Similarly, @env{DU_BLOCK_SIZE} specifies the default for @command{du} and
659 @env{LS_BLOCK_SIZE} for @command{ls}.
662 This specifies the default block size for all three commands, if the
663 above command-specific environment variables are not set.
665 @item POSIXLY_CORRECT
666 If neither the @env{@var{command}_BLOCK_SIZE} nor the @env{BLOCK_SIZE}
667 variables are set, but this variable is set, the block size defaults to 512.
671 If none of the above environment variables are set, the block size
672 currently defaults to 1024 bytes in most contexts, but this number may
673 change in the future. For @command{ls} file sizes, the block size
676 @cindex human-readable output
679 A block size specification can be a positive integer specifying the number
680 of bytes per block, or it can be @code{human-readable} or @code{si} to
681 select a human-readable format. Integers may be followed by suffixes
682 that are upward compatible with the
683 @uref{http://www.bipm.fr/enus/3_SI/si-prefixes.html, SI prefixes}
684 for decimal multiples and with the
685 @uref{http://physics.nist.gov/cuu/Units/binary.html, IEC 60027-2
686 prefixes for binary multiples}.
688 With human-readable formats, output sizes are followed by a size letter
689 such as @samp{M} for megabytes. @code{BLOCK_SIZE=human-readable} uses
690 powers of 1024; @samp{M} stands for 1,048,576 bytes.
691 @code{BLOCK_SIZE=si} is similar, but uses powers of 1000 and appends
692 @samp{B}; @samp{MB} stands for 1,000,000 bytes.
695 A block size specification preceded by @samp{'} causes output sizes to
696 be displayed with thousands separators. The @env{LC_NUMERIC} locale
697 specifies the thousands separator and grouping. For example, in an
698 American English locale, @samp{--block-size="'1kB"} would cause a size
699 of 1234000 bytes to be displayed as @samp{1,234}. In the default C
700 locale, there is no thousands separator so a leading @samp{'} has no
703 An integer block size can be followed by a suffix to specify a
704 multiple of that size. A bare size letter,
705 or one followed by @samp{iB}, specifies
706 a multiple using powers of 1024. A size letter followed by @samp{B}
707 specifies powers of 1000 instead. For example, @samp{1M} and
708 @samp{1MiB} are equivalent to @samp{1048576}, whereas @samp{1MB} is
709 equivalent to @samp{1000000}.
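
The difference between the binary and decimal suffixes can be seen in
a sketch such as this one.  It assumes a @sc{gnu} @command{ls} and an
@command{awk} implementation, and the helper function @code{size_in}
is invented for the illustration.

```shell
tmp=$(mktemp -d)
# Create a file of exactly 1048576 bytes.
dd if=/dev/zero of="$tmp/f" bs=1024 count=1024 2>/dev/null

# Print the size column of 'ls -ln' for the given block size.
size_in() { ls -ln --block-size="$1" "$tmp/f" | awk '{ print $5 }'; }

m=$(size_in 1M)      # blocks of 1048576 bytes
mib=$(size_in 1MiB)  # same as 1M
mb=$(size_in 1MB)    # blocks of 1000000 bytes, count rounded up
echo "1M=$m 1MiB=$mib 1MB=$mb"
rm -rf "$tmp"
```

With 1,048,576 bytes the file fills exactly one @samp{1M} block, but
needs two @samp{1MB} blocks because fractional block counts are
rounded up.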
711 A plain suffix without a preceding integer acts as if @samp{1} were
712 prepended, except that it causes a size indication to be appended to
713 the output. For example, @samp{--block-size="kB"} displays 3000 as
716 The following suffixes are defined. Large sizes like @code{1Y}
717 may be rejected by your computer due to limitations of its arithmetic.
721 @cindex kilobyte, definition of
722 kilobyte: @math{10^3 = 1000}.
726 @cindex kibibyte, definition of
727 kibibyte: @math{2^10 = 1024}. @samp{K} is special: the SI prefix is
728 @samp{k} and the IEC 60027-2 prefix is @samp{Ki}, but tradition and
729 @acronym{POSIX} use @samp{k} to mean @samp{KiB}.
731 @cindex megabyte, definition of
732 megabyte: @math{10^6 = 1,000,000}.
735 @cindex mebibyte, definition of
736 mebibyte: @math{2^20 = 1,048,576}.
738 @cindex gigabyte, definition of
739 gigabyte: @math{10^9 = 1,000,000,000}.
742 @cindex gibibyte, definition of
743 gibibyte: @math{2^30 = 1,073,741,824}.
745 @cindex terabyte, definition of
746 terabyte: @math{10^12 = 1,000,000,000,000}.
749 @cindex tebibyte, definition of
750 tebibyte: @math{2^40 = 1,099,511,627,776}.
752 @cindex petabyte, definition of
753 petabyte: @math{10^15 = 1,000,000,000,000,000}.
756 @cindex pebibyte, definition of
757 pebibyte: @math{2^50 = 1,125,899,906,842,624}.
759 @cindex exabyte, definition of
760 exabyte: @math{10^18 = 1,000,000,000,000,000,000}.
763 @cindex exbibyte, definition of
764 exbibyte: @math{2^60 = 1,152,921,504,606,846,976}.
766 @cindex zettabyte, definition of
zettabyte: @math{10^21 = 1,000,000,000,000,000,000,000}.
770 @math{2^70 = 1,180,591,620,717,411,303,424}.
771 (@samp{Zi} is a GNU extension to IEC 60027-2.)
773 @cindex yottabyte, definition of
774 yottabyte: @math{10^24 = 1,000,000,000,000,000,000,000,000}.
777 @math{2^80 = 1,208,925,819,614,629,174,706,176}.
778 (@samp{Yi} is a GNU extension to IEC 60027-2.)
783 @opindex --block-size
784 @opindex --human-readable
787 Block size defaults can be overridden by an explicit
788 @option{--block-size=@var{size}} option. The @option{-k}
789 option is equivalent to @option{--block-size=1K}, which
790 is the default unless the @env{POSIXLY_CORRECT} environment variable is
791 set. The @option{-h} or @option{--human-readable} option is equivalent to
792 @option{--block-size=human-readable}. The @option{--si} option is
793 equivalent to @option{--block-size=si}.
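
The order in which the environment variables and the explicit option
are consulted can be sketched as follows.  The example assumes a
@sc{gnu} @command{ls}, uses @command{env} to set the variables for one
command at a time, and works on a hypothetical 1,048,576-byte file.

```shell
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/f" bs=1024 count=1024 2>/dev/null
unset LS_BLOCK_SIZE BLOCK_SIZE POSIXLY_CORRECT

# Only the general variable is set: sizes are counted in 1M blocks.
general=$(env BLOCK_SIZE=1M ls -ln "$tmp/f" | awk '{ print $5 }')

# The command-specific variable takes precedence over BLOCK_SIZE.
specific=$(env BLOCK_SIZE=1M LS_BLOCK_SIZE=1MB ls -ln "$tmp/f" |
  awk '{ print $5 }')

# An explicit --block-size option overrides both variables.
explicit=$(env BLOCK_SIZE=1M LS_BLOCK_SIZE=1MB \
  ls -ln --block-size=1K "$tmp/f" | awk '{ print $5 }')

echo "general=$general specific=$specific explicit=$explicit"
rm -rf "$tmp"
```

The three sizes are expected to be 1, 2, and 1024 respectively.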
795 @node Target directory
796 @section Target directory
798 @cindex target directory
Some @sc{gnu} programs (at least @command{cp}, @command{install}, @command{ln}, and
@command{mv}) allow you to specify the target directory via this option:
805 @itemx @w{@kbd{--target-directory}=@var{directory}}
806 @opindex --target-directory
807 @cindex target directory
808 @cindex destination directory
809 Specify the destination @var{directory}.
811 The interface for most programs is that after processing options and a
812 finite (possibly zero) number of fixed-position arguments, the remaining
813 argument list is either expected to be empty, or is a list of items
814 (usually files) that will all be handled identically. The @code{xargs}
815 program is designed to work well with this convention.
817 The commands in the @command{mv}-family are unusual in that they take
818 a variable number of arguments with a special case at the @emph{end}
819 (namely, the target directory). This makes it nontrivial to perform some
820 operations, e.g., ``move all files from here to ../d/'', because
821 @code{mv * ../d/} might exhaust the argument space, and @code{ls | xargs ...}
822 doesn't have a clean way to specify an extra final argument for each
823 invocation of the subject command. (It can be done by going through a
824 shell command, but that requires more human labor and brain power than
The @w{@kbd{--target-directory}} option allows the @command{cp},
@command{install}, @command{ln}, and @command{mv} programs to be used conveniently
with @code{xargs}. For example, you can move the files from the
current directory to a sibling directory, @code{d}, like this
(although this does not move files whose names begin with @samp{.}):
ls | xargs mv --target-directory=../d
837 If you use the @sc{gnu} @code{find} program, you can move @emph{all}
838 files with this command:
840 find . -mindepth 1 -maxdepth 1 \
841 | xargs mv --target-directory=../d
844 But that will fail if there are no files in the current directory
845 or if any file has a name containing a newline character.
846 The following example removes those limitations and requires both
847 @sc{gnu} @code{find} and @sc{gnu} @code{xargs}:
849 find . -mindepth 1 -maxdepth 1 -print0 \
850 | xargs --null --no-run-if-empty \
851 mv --target-directory=../d
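
The last pipeline can be exercised safely in a scratch area, as in the
sketch below.  The directory layout and file names are invented for
the illustration, and both @sc{gnu} @code{find} and @sc{gnu}
@code{xargs} are assumed to be installed.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/here" "$tmp/d"
# An ordinary file, a dot file, and a name containing a space.
touch "$tmp/here/a" "$tmp/here/.hidden" "$tmp/here/with space"

(
  cd "$tmp/here" &&
  find . -mindepth 1 -maxdepth 1 -print0 |
    xargs --null --no-run-if-empty mv --target-directory=../d
)

moved=$(ls -A "$tmp/d" | wc -l)
left=$(ls -A "$tmp/here" | wc -l)
echo "moved=$moved left=$left"
rm -rf "$tmp"
```

All three files, including the dot file and the one with a space in
its name, should end up in @file{d}, leaving the source directory
empty.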
856 @node Trailing slashes
857 @section Trailing slashes
859 @cindex trailing slashes
Some @sc{gnu} programs (at least @command{cp} and @command{mv}) allow you to
862 remove any trailing slashes from each @var{source} argument before
863 operating on it. The @w{@kbd{--strip-trailing-slashes}} option enables
866 This is useful when a @var{source} argument may have a trailing slash and
867 @c FIXME: mv's behavior in this case is system-dependent
868 specify a symbolic link to a directory. This scenario is in fact rather
869 common because some shells can automatically append a trailing slash when
870 performing file name completion on such symbolic links. Without this
871 option, @command{mv}, for example, (via the system's rename function) must
872 interpret a trailing slash as a request to dereference the symbolic link
873 and so must rename the indirectly referenced @emph{directory} and not
874 the symbolic link. Although it may seem surprising that such behavior
875 be the default, it is required by @acronym{POSIX} and is consistent with
876 other parts of that standard.
878 @node Standards conformance
879 @section Standards conformance
881 @vindex POSIXLY_CORRECT
882 In a few cases, the @sc{gnu} utilities' default behavior is
883 incompatible with the @acronym{POSIX} standard. To suppress these
884 incompatibilities, define the @env{POSIXLY_CORRECT} environment
885 variable. Unless you are checking for @acronym{POSIX} conformance, you
886 probably do not need to define @env{POSIXLY_CORRECT}.
888 Newer versions of @acronym{POSIX} are occasionally incompatible with older
889 versions. For example, older versions of @acronym{POSIX} required the
890 command @samp{sort +1} to sort based on the second and succeeding
891 fields in each input line, but starting with @acronym{POSIX} 1003.1-2001
892 the same command is required to sort the file named @file{+1}, and you
893 must instead use the command @samp{sort -k 2} to get the field-based sort.
896 @vindex _POSIX2_VERSION
897 The @sc{gnu} utilities normally conform to the version of @acronym{POSIX}
898 that is standard for your system. To cause them to conform to a
899 different version of @acronym{POSIX}, define the @env{_POSIX2_VERSION}
900 environment variable to a value of the form @var{yyyymm} specifying
901 the year and month the standard was adopted. Two values are currently
902 supported for @env{_POSIX2_VERSION}: @samp{199209} stands for
903 @acronym{POSIX} 1003.2-1992, and @samp{200112} stands for @acronym{POSIX}
904 1003.1-2001. For example, if you are running older software that
905 assumes an older version of @acronym{POSIX} and uses @samp{sort +1}, you
906 can work around the compatibility problems by setting
907 @samp{_POSIX2_VERSION=199209} in your environment.
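The portable @samp{sort -k 2} form can be tried directly. This is a small
sketch with made-up input; @env{LC_ALL} is set to @samp{C} only to make
the ordering independent of the current locale.

```shell
# Sort on the second and succeeding fields, the POSIX 1003.1-2001 way.
sorted=$(printf 'b 2\na 3\nc 1\n' | LC_ALL=C sort -k 2)
printf '%s\n' "$sorted"
```

The lines come out ordered by their second field: @samp{c 1}, @samp{b 2},
@samp{a 3}.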
909 @node Output of entire files
910 @chapter Output of entire files
912 @cindex output of entire files
913 @cindex entire files, output of
915 These commands read and write entire files, possibly transforming them in some way.
919 * cat invocation:: Concatenate and write files.
920 * tac invocation:: Concatenate and write files in reverse.
921 * nl invocation:: Number lines and write files.
922 * od invocation:: Write files in octal or other formats.
926 @section @command{cat}: Concatenate and write files
929 @cindex concatenate and write files
930 @cindex copying files
932 @command{cat} copies each @var{file} (@samp{-} means standard input), or
933 standard input if none are given, to standard output. Synopsis:
936 cat [@var{option}]@dots{} [@var{file}]@dots{}
939 The program accepts the following options. Also see @ref{Common options}.
947 Equivalent to @option{-vET}.
953 @cindex binary and text I/O in cat
954 On MS-DOS and MS-Windows only, read and write the files in binary mode.
955 By default, @command{cat} on MS-DOS/MS-Windows uses binary mode only when
956 standard output is redirected to a file or a pipe; this option overrides
957 that. Binary file I/O is used so that the files retain their format
958 (Unix text as opposed to DOS text and binary), because @command{cat} is
959 frequently used as a file-copying program. Some options (see below)
960 cause @command{cat} to read and write files in text mode because in those
961 cases the original file contents aren't important (e.g., when lines are
962 numbered by @command{cat}, or when line endings should be marked). This is
963 so these options work as DOS/Windows users would expect; for example,
964 DOS-style text files have their lines end with the CR-LF pair of
965 characters, which won't be processed as an empty line by @option{-b} unless
966 the file is read in text mode.
969 @itemx --number-nonblank
971 @opindex --number-nonblank
972 Number all nonblank output lines, starting with 1. On MS-DOS and
973 MS-Windows, this option causes @command{cat} to read and write files in text mode.
978 Equivalent to @option{-vE}.
984 Display a @samp{$} after the end of each line. On MS-DOS and
985 MS-Windows, this option causes @command{cat} to read and write files in text mode.
992 Number all output lines, starting with 1. On MS-DOS and MS-Windows,
993 this option causes @command{cat} to read and write files in text mode.
996 @itemx --squeeze-blank
998 @opindex --squeeze-blank
999 @cindex squeezing blank lines
1000 Replace multiple adjacent blank lines with a single blank line. On
1001 MS-DOS and MS-Windows, this option causes @command{cat} to read and write files in text mode.
1006 Equivalent to @option{-vT}.
1011 @opindex --show-tabs
1012 Display TAB characters as @samp{^I}.
1016 Ignored; for Unix compatibility.
1019 @itemx --show-nonprinting
1021 @opindex --show-nonprinting
1022 Display control characters except for LFD and TAB using
1023 @samp{^} notation and precede characters that have the high bit set with
1024 @samp{M-}. On MS-DOS and MS-Windows, this option causes @command{cat} to
1025 read files and standard input in DOS binary mode, so the CR
1026 characters at the end of each line are also visible.
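A brief sketch of how some of these options behave on a Unix-like system,
using made-up input. It assumes @sc{gnu} @command{cat}, since @option{-A}
is not available everywhere.

```shell
# Make a tab and the end of the line visible.
shown=$(printf 'a\tb\n' | cat -A)
printf '%s\n' "$shown"        # prints a^Ib$

# Squeeze the run of blank lines, then number only the nonblank lines.
numbered=$(printf 'x\n\n\n\ny\n' | cat -s -b)
printf '%s\n' "$numbered"
```

In the second command the three blank lines collapse to one, and only
@samp{x} and @samp{y} receive numbers.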
1031 @node tac invocation
1032 @section @command{tac}: Concatenate and write files in reverse
1035 @cindex reversing files
1037 @command{tac} copies each @var{file} (@samp{-} means standard input), or
1038 standard input if none are given, to standard output, reversing the
1039 records (lines by default) in each separately. Synopsis:
1042 tac [@var{option}]@dots{} [@var{file}]@dots{}
1045 @dfn{Records} are separated by instances of a string (newline by
1046 default). By default, this separator string is attached to the end of
1047 the record that it follows in the file.
1049 The program accepts the following options. Also see @ref{Common options}.
1057 The separator is attached to the beginning of the record that it
1058 precedes in the file.
1064 Treat the separator string as a regular expression. Users of @command{tac}
1065 on MS-DOS/MS-Windows should note that, since @command{tac} reads files in
1066 binary mode, each line of a text file might end with a CR/LF pair
1067 instead of the Unix-style LF.
1069 @item -s @var{separator}
1070 @itemx --separator=@var{separator}
1072 @opindex --separator
1073 Use @var{separator} as the record separator, instead of newline.
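A short sketch of the default behavior and of @option{-s}, using made-up
input:

```shell
# Lines are the default records, so they come out last to first.
reversed=$(printf '1\n2\n3\n' | tac)

# With a comma as separator, the records are "a," "b," and "c,".
by_comma=$(printf 'a,b,c,' | tac -s ,)

printf '%s\n%s\n' "$reversed" "$by_comma"
```

Because the separator stays attached to the end of the record that it
follows, the second command yields @samp{c,b,a,}.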
1079 @section @command{nl}: Number lines and write files
1082 @cindex numbering lines
1083 @cindex line numbering
1085 @command{nl} writes each @var{file} (@samp{-} means standard input), or
1086 standard input if none are given, to standard output, with line numbers
1087 added to some or all of the lines. Synopsis:
1090 nl [@var{option}]@dots{} [@var{file}]@dots{}
1093 @cindex logical pages, numbering on
1094 @command{nl} decomposes its input into (logical) pages; by default, the
1095 line number is reset to 1 at the top of each logical page. @command{nl}
1096 treats all of the input files as a single document; it does not reset
1097 line numbers or logical pages between files.
1099 @cindex headers, numbering
1100 @cindex body, numbering
1101 @cindex footers, numbering
1102 A logical page consists of three sections: header, body, and footer.
1103 Any of the sections can be empty. Each can be numbered in a different
1104 style from the others.
1106 The beginnings of the sections of logical pages are indicated in the
1107 input file by a line containing exactly one of these delimiter strings:
1118 The two characters from which these strings are made can be changed from
1119 @samp{\} and @samp{:} via options (see below), but the pattern and
1120 length of each string cannot be changed.
1122 A section delimiter is replaced by an empty line on output. Any text
1123 that comes before the first section delimiter string in the input file
1124 is considered to be part of a body section, so @command{nl} treats a
1125 file that contains no section delimiters as a single body section.
1127 The program accepts the following options. Also see @ref{Common options}.
1131 @item -b @var{style}
1132 @itemx --body-numbering=@var{style}
1134 @opindex --body-numbering
1135 Select the numbering style for lines in the body section of each
1136 logical page. When a line is not numbered, the current line number
1137 is not incremented, but the line number separator character is still
1138 prepended to the line. The styles are:
1144 number only nonempty lines (default for body),
1146 do not number lines (default for header and footer),
1148 number only lines that contain a match for @var{regexp}.
1152 @itemx --section-delimiter=@var{cd}
1154 @opindex --section-delimiter
1155 @cindex section delimiters of pages
1156 Set the section delimiter characters to @var{cd}; default is
1157 @samp{\:}. If only @var{c} is given, the second remains @samp{:}.
1158 (Remember to protect @samp{\} or other metacharacters from shell
1159 expansion with quotes or extra backslashes.)
1161 @item -f @var{style}
1162 @itemx --footer-numbering=@var{style}
1164 @opindex --footer-numbering
1165 Analogous to @option{--body-numbering}.
1167 @item -h @var{style}
1168 @itemx --header-numbering=@var{style}
1170 @opindex --header-numbering
1171 Analogous to @option{--body-numbering}.
1173 @item -i @var{number}
1174 @itemx --page-increment=@var{number}
1176 @opindex --page-increment
1177 Increment line numbers by @var{number} (default 1).
1179 @item -l @var{number}
1180 @itemx --join-blank-lines=@var{number}
1182 @opindex --join-blank-lines
1183 @cindex empty lines, numbering
1184 @cindex blank lines, numbering
1185 Consider @var{number} (default 1) consecutive empty lines to be one
1186 logical line for numbering, and only number the last one. Where fewer
1187 than @var{number} consecutive empty lines occur, do not number them.
1188 An empty line is one that contains no characters, not even spaces or tabs.
1191 @item -n @var{format}
1192 @itemx --number-format=@var{format}
1194 @opindex --number-format
1195 Select the line numbering format (default is @code{rn}):
1199 @opindex ln @r{format for @command{nl}}
1200 left justified, no leading zeros;
1202 @opindex rn @r{format for @command{nl}}
1203 right justified, no leading zeros;
1205 @opindex rz @r{format for @command{nl}}
1206 right justified, leading zeros.
1210 @itemx --no-renumber
1212 @opindex --no-renumber
1213 Do not reset the line number at the start of a logical page.
1215 @item -s @var{string}
1216 @itemx --number-separator=@var{string}
1218 @opindex --number-separator
1219 Separate the line number from the text line in the output with
1220 @var{string} (default is the TAB character).
1222 @item -v @var{number}
1223 @itemx --starting-line-number=@var{number}
1225 @opindex --starting-line-number
1226 Set the initial line number on each logical page to @var{number} (default 1).
1228 @item -w @var{number}
1229 @itemx --number-width=@var{number}
1231 @opindex --number-width
1232 Use @var{number} characters for line numbers (default 6).
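As a sketch of how several of these options combine, the following
numbers every line of some made-up input with three-digit, zero-padded
numbers, separated from the text by a colon and a space:

```shell
# -b a: number all body lines; -n rz: right justified, leading zeros;
# -w 3: three characters wide; -s ': ': separator string.
numbered=$(printf 'alpha\nbeta\n' | nl -b a -n rz -w 3 -s ': ')
printf '%s\n' "$numbered"
```

This prints @samp{001: alpha} and @samp{002: beta}.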
1238 @section @command{od}: Write files in octal or other formats
1241 @cindex octal dump of files
1242 @cindex hex dump of files
1243 @cindex ASCII dump of files
1244 @cindex file contents, dumping unambiguously
1246 @command{od} writes an unambiguous representation of each @var{file}
1247 (@samp{-} means standard input), or standard input if none are given.
1251 od [@var{option}]@dots{} [@var{file}]@dots{}
1252 od --traditional [@var{file}] [[+]@var{offset} [[+]@var{label}]]
1255 Each line of output consists of the offset in the input, followed by
1256 groups of data from the file. By default, @command{od} prints the offset in
1257 octal, and each group of file data is two bytes of input printed as a
1258 single octal number.
1260 The program accepts the following options. Also see @ref{Common options}.
1264 @item -A @var{radix}
1265 @itemx --address-radix=@var{radix}
1267 @opindex --address-radix
1268 @cindex radix for file offsets
1269 @cindex file offset radix
1270 Select the base in which file offsets are printed. @var{radix} can
1271 be one of the following:
1281 none (do not print offsets).
1284 The default is octal.
1286 @item -j @var{bytes}
1287 @itemx --skip-bytes=@var{bytes}
1289 @opindex --skip-bytes
1290 Skip @var{bytes} input bytes before formatting and writing. If
1291 @var{bytes} begins with @samp{0x} or @samp{0X}, it is interpreted in
1292 hexadecimal; otherwise, if it begins with @samp{0}, in octal; otherwise,
1293 in decimal. Appending @samp{b} multiplies @var{bytes} by 512, @samp{k}
1294 by 1024, and @samp{m} by 1048576.
1296 @item -N @var{bytes}
1297 @itemx --read-bytes=@var{bytes}
1299 @opindex --read-bytes
1300 Output at most @var{bytes} bytes of the input. Prefixes and suffixes on
1301 @var{bytes} are interpreted as for the @option{-j} option.
1304 @itemx --strings[=@var{n}]
1307 @cindex string constants, outputting
1308 Instead of the normal output, output only @dfn{string constants}: at
1309 least @var{n} consecutive @acronym{ASCII} graphic characters,
1310 followed by a null (zero) byte.
1312 If @var{n} is omitted with @option{--strings}, the default is 3. On
1313 older systems, @sc{gnu} @command{od} instead supports an obsolete
1314 option @option{-s[@var{n}]}, where @var{n} also defaults to 3.
1315 @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance}) does not allow
1316 @option{-s} without an argument; use @option{--strings} instead.
1319 @itemx --format=@var{type}
1322 Select the format in which to output the file data. @var{type} is a
1323 string of one or more of the below type indicator characters. If you
1324 include more than one type indicator character in a single @var{type}
1325 string, or use this option more than once, @command{od} writes one copy
1326 of each output line using each of the data types that you specified,
1327 in the order that you specified.
1329 Adding a trailing ``z'' to any type specification appends a display
1330 of the @acronym{ASCII} character representation of the printable characters
1331 to the output line generated by the type specification.
1337 @acronym{ASCII} character or backslash escape,
1350 The type @code{a} outputs things like @samp{sp} for space, @samp{nl} for
1351 newline, and @samp{nul} for a null (zero) byte. Type @code{c} outputs
1352 @samp{ }, @samp{\n}, and @samp{\0}, respectively.
1355 Except for types @samp{a} and @samp{c}, you can specify the number
1356 of bytes to use in interpreting each number in the given data type
1357 by following the type indicator character with a decimal integer.
1358 Alternatively, you can specify the size of one of the C compiler's
1359 built-in data types by following the type indicator character with
1360 one of the following characters. For integers (@samp{d}, @samp{o},
1361 @samp{u}, @samp{x}):
1374 For floating point (@code{f}):
1386 @itemx --output-duplicates
1388 @opindex --output-duplicates
1389 Output consecutive lines that are identical. By default, when two or
1390 more consecutive output lines would be identical, @command{od} outputs only
1391 the first line, and puts just an asterisk on the following line to
1392 indicate the elision.
1395 @itemx --width[=@var{n}]
1398 Dump @var{n} input bytes per output line. This must be a multiple of
1399 the least common multiple of the sizes associated with the specified output types.
1402 If this option is not given at all, the default is 16. If @var{n} is
1403 omitted with @option{--width}, the default is 32. On older systems,
1404 @sc{gnu} @command{od} instead supports an obsolete option
1405 @option{-w[@var{n}]}, where @var{n} also defaults to 32. @acronym{POSIX}
1406 1003.1-2001 (@pxref{Standards conformance}) does not allow @option{-w}
1407 without an argument; use @option{--width} instead.
1411 The next several options are shorthands for format specifications.
1412 @sc{gnu} @command{od} accepts any combination of shorthands and format
1413 specification options. These options accumulate.
1419 Output as named characters. Equivalent to @option{-ta}.
1423 Output as octal bytes. Equivalent to @option{-toC}.
1427 Output as @acronym{ASCII} characters or backslash escapes. Equivalent to @option{-tc}.
1432 Output as unsigned decimal shorts. Equivalent to @option{-tu2}.
1436 Output as floats. Equivalent to @option{-tfF}.
1440 Output as hexadecimal shorts. Equivalent to @option{-tx2}.
1444 Output as decimal shorts. Equivalent to @option{-td2}.
1448 Output as decimal longs. Equivalent to @option{-td4}.
1452 Output as octal shorts. Equivalent to @option{-to2}.
1456 Output as hexadecimal shorts. Equivalent to @option{-tx2}.
1459 @opindex --traditional
1460 Recognize the non-option arguments that traditional @command{od}
1461 accepted. The following syntax:
1464 od --traditional [@var{file}] [[+]@var{offset}[.][b] [[+]@var{label}[.][b]]]
1468 can be used to specify at most one file and optional arguments
1469 specifying an offset and a pseudo-start address, @var{label}. By
1470 default, @var{offset} is interpreted as an octal number specifying how
1471 many input bytes to skip before formatting and writing. The optional
1472 trailing decimal point forces the interpretation of @var{offset} as a
1473 decimal number. If no decimal is specified and the offset begins with
1474 @samp{0x} or @samp{0X} it is interpreted as a hexadecimal number. If
1475 there is a trailing @samp{b}, the number of bytes skipped will be
1476 @var{offset} multiplied by 512. The @var{label} argument is interpreted
1477 just like @var{offset}, but it specifies an initial pseudo-address. The
1478 pseudo-addresses are displayed in parentheses following any normal address.
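The following sketch exercises @option{-A}, @option{-t}, @option{-j}, and
@option{-N} on made-up input. One-byte hexadecimal output is chosen
because, unlike the default two-byte groups, it does not depend on the
byte order of the machine.

```shell
# Offsets in decimal, data as one-byte hexadecimal values.
printf 'AB' | od -A d -t x1

# Skip two bytes, dump at most three, and print no offsets.
# tr strips the layout whitespace so that only the hex digits remain.
hex=$(printf 'ABCDEF' | od -A n -t x1 -j 2 -N 3 | tr -d ' \n')
echo "$hex"
```

The second command selects the bytes @samp{C}, @samp{D}, and @samp{E},
so it prints @samp{434445}.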
1484 @node Formatting file contents
1485 @chapter Formatting file contents
1487 @cindex formatting file contents
1489 These commands reformat the contents of files.
1492 * fmt invocation:: Reformat paragraph text.
1493 * pr invocation:: Paginate or columnate files for printing.
1494 * fold invocation:: Wrap input lines to fit in specified width.
1498 @node fmt invocation
1499 @section @command{fmt}: Reformat paragraph text
1502 @cindex reformatting paragraph text
1503 @cindex paragraphs, reformatting
1504 @cindex text, reformatting
1506 @command{fmt} fills and joins lines to produce output lines of (at most)
1507 a given number of characters (75 by default). Synopsis:
1510 fmt [@var{option}]@dots{} [@var{file}]@dots{}
1513 @command{fmt} reads from the specified @var{file} arguments (or standard
1514 input if none are given), and writes to standard output.
1516 By default, blank lines, spaces between words, and indentation are
1517 preserved in the output; successive input lines with different
1518 indentation are not joined; tabs are expanded on input and introduced on output.
1521 @cindex line-breaking
1522 @cindex sentences and line-breaking
1523 @cindex Knuth, Donald E.
1524 @cindex Plass, Michael F.
1525 @command{fmt} prefers breaking lines at the end of a sentence, and tries to
1526 avoid line breaks after the first word of a sentence or before the last
1527 word of a sentence. A @dfn{sentence break} is defined as either the end
1528 of a paragraph or a word ending in any of @samp{.?!}, followed by two
1529 spaces or end of line, ignoring any intervening parentheses or quotes.
1530 Like @TeX{}, @command{fmt} reads entire ``paragraphs'' before choosing line
1531 breaks; the algorithm is a variant of that in ``Breaking Paragraphs Into
1532 Lines'' (Donald E. Knuth and Michael F. Plass, @cite{Software---Practice
1533 and Experience}, 11 (1981), 1119--1184).
1535 The program accepts the following options. Also see @ref{Common options}.
1540 @itemx --crown-margin
1542 @opindex --crown-margin
1543 @cindex crown margin
1544 @dfn{Crown margin} mode: preserve the indentation of the first two
1545 lines within a paragraph, and align the left margin of each subsequent
1546 line with that of the second line.
1549 @itemx --tagged-paragraph
1551 @opindex --tagged-paragraph
1552 @cindex tagged paragraphs
1553 @dfn{Tagged paragraph} mode: like crown margin mode, except that if
1554 indentation of the first line of a paragraph is the same as the
1555 indentation of the second, the first line is treated as a one-line paragraph.
1561 @opindex --split-only
1562 Split lines only. Do not join short lines to form longer ones. This
1563 prevents sample lines of code, and other such ``formatted'' text from
1564 being unduly combined.
1567 @itemx --uniform-spacing
1569 @opindex --uniform-spacing
1570 Uniform spacing. Reduce spacing between words to one space, and spacing
1571 between sentences to two spaces.
1574 @itemx -w @var{width}
1575 @itemx --width=@var{width}
1576 @opindex -@var{width}
1579 Fill output lines up to @var{width} characters (default 75). @command{fmt}
1580 initially tries to make lines about 7% shorter than this, to give it
1581 room to balance line lengths.
1583 @item -p @var{prefix}
1584 @itemx --prefix=@var{prefix}
1585 Only lines beginning with @var{prefix} (possibly preceded by whitespace)
1586 are subject to formatting. The prefix and any preceding whitespace are
1587 stripped for the formatting and then re-attached to each formatted output
1588 line. One use is to format certain kinds of program comments, while
1589 leaving the code unchanged.
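A small sketch contrasting the default filling with @option{-s}, using
made-up input that easily fits in the chosen width of 20 characters:

```shell
# By default, short lines of a paragraph are joined.
joined=$(printf 'aaa bbb\nccc\n' | fmt -w 20)

# With -s, lines may be split but are never joined.
split_only=$(printf 'aaa bbb\nccc\n' | fmt -s -w 20)

printf '%s\n---\n%s\n' "$joined" "$split_only"
```

The first result is the single line @samp{aaa bbb ccc}; the second keeps
the two input lines as they were.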
1595 @section @command{pr}: Paginate or columnate files for printing
1598 @cindex printing, preparing files for
1599 @cindex multicolumn output, generating
1600 @cindex merging files in parallel
1602 @command{pr} writes each @var{file} (@samp{-} means standard input), or
1603 standard input if none are given, to standard output, paginating and
1604 optionally outputting in multicolumn format; optionally merges all
1605 @var{file}s, printing all in parallel, one per column. Synopsis:
1608 pr [@var{option}]@dots{} [@var{file}]@dots{}
1612 By default, a 5-line header is printed on each page: two blank lines;
1613 a line with the date, the filename, and the page number; and two more
1614 blank lines. A footer of five blank lines is also printed.
1615 With the @option{-F}
1616 option, a 3-line header is printed: the leading two blank lines are
1617 omitted; no footer is used. The default @var{page_length} in both cases is 66
1618 lines. The default number of text lines changes from 56 (without @option{-F})
1619 to 63 (with @option{-F}). The text line of the header takes the form
1620 @samp{@var{date} @var{string} @var{page}}, with spaces inserted around
1621 @var{string} so that the line takes up the full @var{page_width}. Here,
1622 @var{date} is the date (see the @option{-D} or @option{--date-format}
1623 option for details), @var{string} is the centered header string, and
1624 @var{page} identifies the page number. The @env{LC_MESSAGES} locale
1625 category affects the spelling of @var{page}; in the default C locale, it
1626 is @samp{Page @var{number}} where @var{number} is the decimal page number.
1629 Form feeds in the input cause page breaks in the output. Multiple form
1630 feeds produce empty pages.
1632 Columns are of equal width, separated by an optional string (default
1633 is @samp{space}). For multicolumn output, lines will always be truncated to
1634 @var{page_width} (default 72), unless you use the @option{-J} option.
1636 For single column output, no line truncation occurs by default. Use the @option{-W} option to
1637 truncate lines in that case.
1639 The following changes were made in version 1.22i and apply to later
1640 versions of @command{pr}:
1641 @c FIXME: this whole section here sounds very awkward to me. I
1642 @c made a few small changes, but really it all needs to be redone. - Brian
1643 @c OK, I fixed another sentence or two, but some of it I just don't understand.
1648 Some small letter options (@option{-s}, @option{-w}) have been
1649 redefined for better @acronym{POSIX} compliance. The output of some further
1650 cases has been adapted to other Unix systems. These changes are not
1651 compatible with earlier versions of the program.
1654 Some new capital letter options (@option{-J}, @option{-S}, @option{-W})
1655 have been introduced to keep small letter options from interfering
1656 with one another unexpectedly. The @option{-N} option and the second argument @var{last_page}
1657 of @samp{+@var{first_page}} offer more flexibility. The detailed handling of
1658 form feeds set in the input files requires the @option{-T} option.
1661 Capital letter options override small letter ones.
1664 Some of the option-arguments (compare @option{-s}, @option{-e},
1665 @option{-i}, @option{-n}) cannot be specified as separate arguments from the
1666 preceding option letter (already stated in the @acronym{POSIX} specification).
1669 The program accepts the following options. Also see @ref{Common options}.
1673 @item +@var{first_page}[:@var{last_page}]
1674 @itemx --pages=@var{first_page}[:@var{last_page}]
1675 @c The two following @opindex lines evoke warnings because they contain `:'
1676 @c The `info' spec does not permit that. If we use those lines, we end
1677 @c up with truncated index entries that don't work.
1678 @c @opindex +@var{first_page}[:@var{last_page}]
1679 @c @opindex --pages=@var{first_page}[:@var{last_page}]
1680 @opindex +@var{page_range}
1681 @opindex --pages=@var{page_range}
1682 Begin printing with page @var{first_page} and stop with @var{last_page}.
1683 Missing @samp{:@var{last_page}} implies end of file. While estimating
1684 the number of skipped pages, each form feed in the input file results
1685 in a new page. Page counting with and without @samp{+@var{first_page}}
1686 is identical. By default, counting starts with the first page of the input
1687 file (not the first page printed). Line numbering may be altered by the @option{-N} option.
1691 @itemx --columns=@var{column}
1692 @opindex -@var{column}
1694 @cindex down columns
1695 With each single @var{file}, produce @var{column} columns of output
1696 (default is 1) and print columns down, unless @option{-a} is used. The
1697 column width is automatically decreased as @var{column} increases, unless
1698 you use the @option{-W/-w} option to increase @var{page_width} as well.
1699 This option might well cause some lines to be truncated. The number of
1700 lines in the columns on each page is balanced. The options @option{-e}
1701 and @option{-i} are on for multiple text-column output. Together with the
1702 @option{-J} option, column alignment and line truncation are turned off.
1703 Lines of full length are joined in a free field format and the @option{-S}
1704 option may set field separators. @option{-@var{column}} may not be used
1705 with the @option{-m} option.
1711 @cindex across columns
1712 With each single @var{file}, print columns across rather than down. The
1713 @option{-@var{column}} option must be given with @var{column} greater than one.
1714 If a line is too long to fit in a column, it is truncated.
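The difference between printing columns down and across can be sketched
with made-up input. The sketch assumes @sc{gnu} @command{pr} and uses two
options documented elsewhere: @option{-t}, which omits the page header and
footer, and @option{-s,}, which separates columns with a comma so that the
result is easy to read.

```shell
# Two columns, filled down: the first column holds 1 and 2.
down=$(printf '1\n2\n3\n4\n' | pr -2 -t -s,)

# Two columns, filled across: the first row holds 1 and 2.
across=$(printf '1\n2\n3\n4\n' | pr -2 -a -t -s,)

printf '%s\n---\n%s\n' "$down" "$across"
```

Filling down gives the rows @samp{1,3} and @samp{2,4}; filling across
gives @samp{1,2} and @samp{3,4}.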
1717 @itemx --show-control-chars
1719 @opindex --show-control-chars
1720 Print control characters using hat notation (e.g., @samp{^G}); print
1721 other nonprinting characters in octal backslash notation. By default,
1722 nonprinting characters are not changed.
1725 @itemx --double-space
1727 @opindex --double-space
1728 @cindex double spacing
1729 Double space the output.
1731 @item -D @var{format}
1732 @itemx --date-format=@var{format}
1733 @cindex time formats
1734 @cindex formatting times
1735 Format header dates using @var{format}, using the same conventions as
1736 for the command @samp{date +@var{format}}. @xref{date invocation}.
1737 Except for directives, which start with
1738 @samp{%}, characters in @var{format} are printed unchanged. You can use
1739 this option to specify an arbitrary string in place of the header date,
1740 e.g., @option{--date-format="Monday morning"}.
1742 @vindex POSIXLY_CORRECT
1744 If the @env{POSIXLY_CORRECT} environment variable is not set, the date
1745 format defaults to @samp{%Y-%m-%d %H:%M} (for example, @samp{2001-12-04
1746 23:59}); otherwise, the format depends on the @env{LC_TIME} locale
1747 category, with the default being @samp{%b %e %H:%M %Y} (for example,
1748 @samp{Dec@ @ 4 23:59 2001}).
1750 @item -e[@var{in-tabchar}[@var{in-tabwidth}]]
1751 @itemx --expand-tabs[=@var{in-tabchar}[@var{in-tabwidth}]]
1753 @opindex --expand-tabs
1755 Expand @var{tab}s to spaces on input. Optional argument @var{in-tabchar} is
1756 the input tab character (default is the TAB character). Second optional
1757 argument @var{in-tabwidth} is the input tab character's width (default is 8).
1765 @opindex --form-feed
1766 Use a form feed instead of newlines to separate output pages. The default
1767 page length of 66 lines is not altered. But the number of lines of text
1768 per page changes from default 56 to 63 lines.
1770 @item -h @var{header}
1771 @itemx --header=@var{header}
1774 Replace the filename in the header with the centered string @var{header}.
1775 When using the shell, @var{header} should be quoted and should be
1776 separated from @option{-h} by a space.
1778 @item -i[@var{out-tabchar}[@var{out-tabwidth}]]
1779 @itemx --output-tabs[=@var{out-tabchar}[@var{out-tabwidth}]]
1781 @opindex --output-tabs
1783 Replace spaces with @var{tab}s on output. Optional argument @var{out-tabchar}
1784 is the output tab character (default is the TAB character). Second optional
1785 argument @var{out-tabwidth} is the output tab character's width (default is 8).
1791 @opindex --join-lines
1792 Merge lines of full length. Used together with the column options
1793 @option{-@var{column}}, @option{-a -@var{column}} or @option{-m}. Turns off
1794 @option{-W/-w} line truncation;
1795 no column alignment used; may be used with
1796 @option{--sep-string[=@var{string}]}. @option{-J} has been introduced
1797 (together with @option{-W} and @option{--sep-string})
1798 to disentangle the old (@acronym{POSIX}-compliant) options @option{-w} and
1799 @option{-s} along with the three column options.
1802 @item -l @var{page_length}
1803 @itemx --length=@var{page_length}
1806 Set the page length to @var{page_length} (default 66) lines, including
1807 the lines of the header [and the footer]. If @var{page_length} is less
1808 than or equal to 10 (or <= 3 with @option{-F}), the header and footer are
1809 omitted, and all form feeds set in input files are eliminated, as if
1810 the @option{-T} option had been given.
1816 Merge and print all @var{file}s in parallel, one in each column. If a
1817 line is too long to fit in a column, it is truncated, unless the @option{-J}
1818 option is used. @option{--sep-string[=@var{string}]} may be used.
1820 Empty pages in some @var{file}s (form feeds set) produce empty columns, still marked
1821 by @var{string}. The result is a continuous line numbering and column
1822 marking throughout the whole merged file. Completely empty merged pages
1823 show no separators or line numbers. The default header becomes
1824 @samp{@var{date} @var{page}} with spaces inserted in the middle; this
1825 may be used with the @option{-h} or @option{--header} option to fill up
1826 the middle blank part.
1828 @item -n[@var{number-separator}[@var{digits}]]
1829 @itemx --number-lines[=@var{number-separator}[@var{digits}]]
1831 @opindex --number-lines
1832 Provide @var{digits} digit line numbering (default for @var{digits} is
1833 5). With multicolumn output the number occupies the first @var{digits}
1834 column positions of each text column or only each line of @option{-m}
1835 output. With single column output the number precedes each line just as
1836 @option{-m} does. Default counting of the line numbers starts with the
1837 first line of the input file (not the first line printed, compare the
1838 @option{--pages} option and @option{-N} option).
1839 Optional argument @var{number-separator} is the character appended to
1840 the line number to separate it from the text that follows. The default
1841 separator is the TAB character. In a strict sense a TAB is always
1842 printed with single column output only. The @var{TAB}-width varies
1843 with the @var{TAB}-position, e.g., with the left @var{margin} specified
1844 by @option{-o} option. With multicolumn output priority is given to
1845 @samp{equal width of output columns} (a @acronym{POSIX} specification).
1846 The @var{TAB}-width is fixed to the value of the first column and does
1847 not change with different values of left @var{margin}. That means a
1848 fixed number of spaces is always printed in the place of the
1849 @var{number-separator tab}. The tabification depends upon the output position.
1852 @item -N @var{line_number}
1853 @itemx --first-line-number=@var{line_number}
1855 @opindex --first-line-number
1856 Start line counting with the number @var{line_number} at the first line of
1857 the first page printed (in most cases not the first line of the input file).
1859 @item -o @var{margin}
1860 @itemx --indent=@var{margin}
1863 @cindex indenting lines
1865 Indent each line with a margin @var{margin} spaces wide (default is zero).
1866 The total page width is the size of the margin plus the @var{page_width}
1867 set with the @option{-W/-w} option. A limited overflow may occur with
1868 numbered single column output (compare @option{-n} option).
1871 @itemx --no-file-warnings
1873 @opindex --no-file-warnings
1874 Do not print a warning message when an argument @var{file} cannot be
1875 opened. (The exit status will still be nonzero, however.)
1877 @item -s[@var{char}]
1878 @itemx --separator[=@var{char}]
1880 @opindex --separator
1881 Separate columns by a single character @var{char}. The default for
1882 @var{char} is the TAB character without @option{-w} and @samp{no
1883 character} with @option{-w}. Without @option{-s} the default separator
1884 @samp{space} is set. @option{-s[char]} turns off line truncation of all
1885 three column options (@option{-COLUMN}|@option{-a -COLUMN}|@option{-m}) unless
1886 @option{-w} is set. This is a @acronym{POSIX}-compliant formulation.
1889 @item -S @var{string}
1890 @itemx --sep-string[=@var{string}]
1892 @opindex --sep-string
1893 Use @var{string} to separate output columns. The @option{-S} option doesn't
1894 affect the @option{-W/-w} option, unlike the @option{-s} option which does. It
1895 does not affect line truncation or column alignment.
1896 Without @option{-S}, and with @option{-J}, @command{pr} uses the default output
1898 Without @option{-S} or @option{-J}, @command{pr} uses a @samp{space}
1899 (same as @option{-S"@w{ }"}). With @option{-S@var{string}},
1900 @var{string} must be nonempty; @option{--sep-string} with no
1901 @var{string} is equivalent to @option{--sep-string=""}.
1903 On older systems, @command{pr} instead supports an obsolete option
1904 @option{-S[@var{string}]}, where @var{string} is optional. @acronym{POSIX}
1905 1003.1-2001 (@pxref{Standards conformance}) does not allow this older
1906 usage. To specify an empty @var{string} portably, use
1907 @option{--sep-string}.
1910 @itemx --omit-header
1912 @opindex --omit-header
1913 Do not print the usual header [and footer] on each page, and do not fill
1914 out the bottom of pages (with blank lines or a form feed). No page
1915 structure is produced, but form feeds set in the input files are retained.
1916 The predefined pagination is not changed. @option{-t} or @option{-T} may be
1917 useful together with other options; e.g.: @option{-t -e4}, expand TAB characters
1918 in the input file to 4 spaces but don't make any other changes. Use of
1919 @option{-t} overrides @option{-h}.
1922 @itemx --omit-pagination
1924 @opindex --omit-pagination
1925 Do not print header [and footer]. In addition eliminate all form feeds
1926 set in the input files.
1929 @itemx --show-nonprinting
1931 @opindex --show-nonprinting
1932 Print nonprinting characters in octal backslash notation.
1934 @item -w @var{page_width}
1935 @itemx --width=@var{page_width}
1938 Set page width to @var{page_width} characters for multiple text-column
1939 output only (default for @var{page_width} is 72). @option{-s[CHAR]} turns
1940 off the default page width and any line truncation and column alignment.
1941 Lines of full length are merged, regardless of the column options
1942 set. No @var{page_width} setting is possible with single column output.
1943 A @acronym{POSIX}-compliant formulation.
1945 @item -W @var{page_width}
1946 @itemx --page_width=@var{page_width}
1948 @opindex --page_width
1949 Set the page width to @var{page_width} characters. That's valid with and
1950 without a column option. Text lines are truncated, unless @option{-J}
1951 is used. Together with one of the three column options
1952 (@option{-@var{column}}, @option{-a -@var{column}} or @option{-m}) column
1953 alignment is always used. The separator options @option{-S} or @option{-s}
1954 don't affect the @option{-W} option. Default is 72 characters. Without
1955 @option{-W @var{page_width}} and without any of the column options NO line
1956 truncation is used (defined to keep downward compatibility and to meet
1957 most frequent tasks). That's equivalent to @option{-W 72 -J}. The header
1958 line is never truncated.
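As a small illustration of the @option{-n} option described earlier in this section, the following sketch numbers two lines read from standard input. It assumes @sc{gnu} @command{pr}; @option{-t} is added only to suppress the page header so that the result is easy to inspect, and the variable name @code{numbered} is arbitrary.

```shell
# Number two input lines using the default 5-digit field and TAB separator.
# -t omits the page header and the blank lines that fill out the page.
numbered=$(printf 'alpha\nbeta\n' | pr -t -n)
printf '%s\n' "$numbered"
```

Each output line should start with the right-justified line number, followed by a TAB and the original text.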
1963 @node fold invocation
1964 @section @command{fold}: Wrap input lines to fit in specified width
1967 @cindex wrapping long input lines
1968 @cindex folding long input lines
1970 @command{fold} writes each @var{file} (@option{-} means standard input), or
1971 standard input if none are given, to standard output, breaking long lines. Synopsis:
1975 fold [@var{option}]@dots{} [@var{file}]@dots{}
1978 By default, @command{fold} breaks lines wider than 80 columns. The output
1979 is split into as many lines as necessary.
1981 @cindex screen columns
1982 @command{fold} counts screen columns by default; thus, a tab may count more
1983 than one column, backspace decreases the column count, and carriage
1984 return sets the column to zero.
1986 The program accepts the following options. Also see @ref{Common options}.
1994 Count bytes rather than columns, so that tabs, backspaces, and carriage
1995 returns are each counted as taking up one column, just like other characters.
2002 Break at word boundaries: the line is broken after the last blank before
2003 the maximum line length. If the line contains no such blanks, the line
2004 is broken at the maximum line length as usual.
2006 @item -w @var{width}
2007 @itemx --width=@var{width}
2010 Use a maximum line length of @var{width} columns instead of 80.
2012 On older systems, @command{fold} supports an obsolete option
2013 @option{-@var{width}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards
2014 conformance}) does not allow this; use @option{-w @var{width}} instead.
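The difference between plain folding and folding with @option{-s} can be sketched as follows. This is only an illustration, assuming a @command{fold} that behaves as described in this section; the sample strings and the variable names @code{hard} and @code{soft} are made up for the example.

```shell
# Break a 10-character line every 4 columns, regardless of content.
hard=$(printf 'abcdefghij\n' | fold -w 4)
printf '%s\n' "$hard"

# With -s, break after the last blank that fits within 10 columns,
# so words are kept whole where possible.
soft=$(printf 'the quick brown fox\n' | fold -s -w 10)
printf '%s\n' "$soft"
```

Note that with @option{-s} the blank at the break point stays at the end of the first output line.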
2020 @node Output of parts of files
2021 @chapter Output of parts of files
2023 @cindex output of parts of files
2024 @cindex parts of files, output of
2026 These commands output pieces of the input.
2029 * head invocation:: Output the first part of files.
2030 * tail invocation:: Output the last part of files.
2031 * split invocation:: Split a file into fixed-size pieces.
2032 * csplit invocation:: Split a file into context-determined pieces.
2035 @node head invocation
2036 @section @command{head}: Output the first part of files
2039 @cindex initial part of files, outputting
2040 @cindex first part of files, outputting
2042 @command{head} prints the first part (10 lines by default) of each
2043 @var{file}; it reads from standard input if no files are given or
2044 when given a @var{file} of @option{-}. Synopsis:
2047 head [@var{option}]@dots{} [@var{file}]@dots{}
2050 If more than one @var{file} is specified, @command{head} prints a
2051 one-line header consisting of
2053 ==> @var{file name} <==
2056 before the output for each @var{file}.
2058 The program accepts the following options. Also see @ref{Common options}.
2062 @item -c @var{bytes}
2063 @itemx --bytes=@var{bytes}
2066 Print the first @var{bytes} bytes, instead of initial lines. Appending
2067 @samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and @samp{m} by 1048576.
2071 @itemx --lines=@var{n}
2074 Output the first @var{n} lines.
2082 Never print file name headers.
2088 Always print file name headers.
2092 On older systems, @command{head} supports an obsolete option
2093 @option{-@var{count}@var{options}}, which is recognized only if it is
2094 specified first. @var{count} is a decimal number optionally followed
2095 by a size letter (@samp{b}, @samp{k}, @samp{m}) as in @code{-c}, or
2096 @samp{l} to mean count by lines, or other option letters (@samp{cqv}).
2097 @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance}) does not allow
2098 this; use @option{-c @var{count}} or @option{-n @var{count}} instead.
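A minimal sketch of the two counting modes of @command{head}, using the @option{-n} and @option{-c} options described above. The input is generated with @command{printf} so that the example is self-contained; the variable names are arbitrary.

```shell
# First two lines of standard input.
first_lines=$(printf 'one\ntwo\nthree\n' | head -n 2)
printf '%s\n' "$first_lines"

# First four bytes instead of lines.
first_bytes=$(printf 'abcdefgh\n' | head -c 4)
printf '%s\n' "$first_bytes"
```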
2100 @node tail invocation
2101 @section @command{tail}: Output the last part of files
2104 @cindex last part of files, outputting
2106 @command{tail} prints the last part (10 lines by default) of each
2107 @var{file}; it reads from standard input if no files are given or
2108 when given a @var{file} of @samp{-}. Synopsis:
2111 tail [@var{option}]@dots{} [@var{file}]@dots{}
2114 If more than one @var{file} is specified, @command{tail} prints a
2115 one-line header consisting of
2117 ==> @var{file name} <==
2120 before the output for each @var{file}.
2122 @cindex BSD @command{tail}
2123 @sc{gnu} @command{tail} can output any amount of data (some other versions of
2124 @command{tail} cannot). It also has no @option{-r} option (print in
2125 reverse), since reversing a file is really a different job from printing
2126 the end of a file; BSD @command{tail} (which is the one with @code{-r}) can
2127 only reverse files that are at most as large as its buffer, which is
2128 typically 32 KiB. A more reliable and versatile way to reverse files is
2129 the @sc{gnu} @command{tac} command.
2131 If any option-argument is a number @var{n} starting with a @samp{+},
2132 @command{tail} begins printing with the @var{n}th item from the start of
2133 each file, instead of from the end.
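The effect of a leading @samp{+} can be seen by comparing the two invocations below. This is a sketch only; the five-line input and the variable names @code{last_two} and @code{from_third} are chosen for illustration.

```shell
# Last two lines (counted from the end of the input).
last_two=$(printf '1\n2\n3\n4\n5\n' | tail -n 2)

# Everything from line 3 onward (counted from the start of the input).
from_third=$(printf '1\n2\n3\n4\n5\n' | tail -n +3)

printf '%s\n---\n%s\n' "$last_two" "$from_third"
```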
2135 The program accepts the following options. Also see @ref{Common options}.
2139 @item -c @var{bytes}
2140 @itemx --bytes=@var{bytes}
2143 Output the last @var{bytes} bytes, instead of final lines. Appending
2144 @samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and @samp{m} by 1048576.
2148 @itemx --follow[=@var{how}]
2151 @cindex growing files
2152 @vindex name @r{follow option}
2153 @vindex descriptor @r{follow option}
2154 Loop forever trying to read more characters at the end of the file,
2155 presumably because the file is growing. This option is ignored when
2156 reading from a pipe.
2157 If more than one file is given, @command{tail} prints a header whenever it
2158 gets output from a different file, to indicate which file that output is from.
2161 There are two ways to specify how you'd like to track files with this option,
2162 but that difference is noticeable only when a followed file is removed or renamed.
2164 If you'd like to continue to track the end of a growing file even after
2165 it has been unlinked, use @option{--follow=descriptor}. This is the default
2166 behavior, but it is not useful if you're tracking a log file that may be
2167 rotated (removed or renamed, then reopened). In that case, use
2168 @option{--follow=name} to track the named file by reopening it periodically
2169 to see if it has been removed and recreated by some other program.
2171 No matter which method you use, if the tracked file is determined to have
2172 shrunk, @command{tail} prints a message saying the file has been truncated
2173 and resumes tracking the end of the file from the newly-determined endpoint.
2175 When a file is removed, @command{tail}'s behavior depends on whether it is
2176 following the name or the descriptor. When following by name, tail can
2177 detect that a file has been removed and gives a message to that effect,
2178 and if @option{--retry} has been specified it will continue checking
2179 periodically to see if the file reappears.
2180 When following a descriptor, tail does not detect that the file has
2181 been unlinked or renamed and issues no message; even though the file
2182 may no longer be accessible via its original name, it may still be growing.
2185 The option values @samp{descriptor} and @samp{name} may be specified only
2186 with the long form of the option, not with @option{-f}.
2190 This option is the same as @option{--follow=name --retry}. That is, tail
2191 will attempt to reopen a file when it is removed. Should this fail, tail
2192 will keep trying until it becomes accessible again.
2196 This option is meaningful only when following by name.
2197 Without this option, when tail encounters a file that doesn't
2198 exist or is otherwise inaccessible, it reports that fact and
2199 never checks it again.
2201 @itemx --sleep-interval=@var{number}
2202 @opindex --sleep-interval
2203 Change the number of seconds to wait between iterations (the default is 1.0).
2204 During one iteration, every specified file is checked to see if it has changed size.
2205 Historical implementations of @command{tail} have required that
2206 @var{number} be an integer. However, @sc{gnu} @command{tail} accepts
2207 an arbitrary floating point number.
2209 @itemx --pid=@var{pid}
2211 When following by name or by descriptor, you may specify the process ID,
2212 @var{pid}, of the sole writer of all @var{file} arguments. Then, shortly
2213 after that process terminates, tail will also terminate. This will
2214 work properly only if the writer and the tailing process are running on
2215 the same machine. For example, to save the output of a build in a file
2216 and to watch the file grow, if you invoke @code{make} and @command{tail}
2217 like this then the tail process will stop when your build completes.
2218 Without this option, you would have had to kill the @code{tail -f} process yourself.
2221 $ make >& makerr & tail --pid=$! -f makerr
2223 If you specify a @var{pid} that is not in use or that does not correspond
2224 to the process that is writing to the tailed files, then @command{tail}
2225 may terminate long before any @var{file}s stop growing or it may not
2226 terminate until long after the real writer has terminated.
2227 Note that @option{--pid} cannot be supported on some systems; @command{tail}
2228 will print a warning if this is the case.
2230 @itemx --max-unchanged-stats=@var{n}
2231 @opindex --max-unchanged-stats
2232 When tailing a file by name, if there have been @var{n} (default
2233 n=@value{DEFAULT_MAX_N_UNCHANGED_STATS_BETWEEN_OPENS}) consecutive
2234 iterations for which the size has remained the same, then
2235 @code{open}/@code{fstat} the file to determine if that file name is
2236 still associated with the same device/inode-number pair as before.
2237 When following a log file that is rotated, this is approximately the
2238 number of seconds between when tail prints the last pre-rotation lines
2239 and when it prints the lines that have accumulated in the new log file.
2240 This option is meaningful only when following by name.
2243 @itemx --lines=@var{n}
2246 Output the last @var{n} lines.
2254 Never print file name headers.
2260 Always print file name headers.
2264 On older systems, @command{tail} supports an obsolete option
2265 @option{-@var{count}@var{options}}, which is recognized only if it is
2266 specified first. @var{count} is a decimal number optionally followed
2267 by a size letter (@samp{b}, @samp{k}, @samp{m}) as in @code{-c}, or
2268 @samp{l} to mean count by lines, or other option letters
2269 (@samp{cfqv}). Some older @command{tail} implementations also support
2270 an obsolete option @option{+@var{count}} with the same meaning as
2271 @option{-+@var{count}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards
2272 conformance}) does not allow these options; use @option{-c
2273 @var{count}} or @option{-n @var{count}} instead.
2275 @node split invocation
2276 @section @command{split}: Split a file into fixed-size pieces
2279 @cindex splitting a file into pieces
2280 @cindex pieces, splitting a file into
2282 @command{split} creates output files containing consecutive sections of
2283 @var{input} (standard input if none is given or @var{input} is
2284 @samp{-}). Synopsis:
2287 split [@var{option}] [@var{input} [@var{prefix}]]
2290 By default, @command{split} puts 1000 lines of @var{input} (or whatever is
2291 left over for the last section), into each output file.
2293 @cindex output file name prefix
2294 The output files' names consist of @var{prefix} (@samp{x} by default)
2295 followed by a group of letters (@samp{aa}, @samp{ab}, @dots{} by default),
2296 such that concatenating the output files in sorted order by file name produces
2297 the original input file. If the output file names are exhausted,
2298 @command{split} reports an error without deleting the output files
2301 The program accepts the following options. Also see @ref{Common options}.
2305 @item -a @var{length}
2306 @itemx --suffix-length=@var{length}
2308 @opindex --suffix-length
2309 Use suffixes of length @var{length}. The default @var{length} is 2.
2311 @item -l @var{lines}
2312 @itemx --lines=@var{lines}
2315 Put @var{lines} lines of @var{input} into each output file.
2317 On older systems, @command{split} supports an obsolete option
2318 @option{-@var{lines}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards
2319 conformance}) does not allow this; use @option{-l @var{lines}} instead.
2322 @item -b @var{bytes}
2323 @itemx --bytes=@var{bytes}
2326 Put @var{bytes} bytes of @var{input} into each output file.
2327 Appending @samp{b} multiplies @var{bytes} by 512, @samp{k} by 1024, and
2328 @samp{m} by 1048576.
2330 @item -C @var{bytes}
2331 @itemx --line-bytes=@var{bytes}
2333 @opindex --line-bytes
2334 Put into each output file as many complete lines of @var{input} as
2335 possible without exceeding @var{bytes} bytes. For lines longer than
2336 @var{bytes} bytes, put @var{bytes} bytes into each output file until
2337 less than @var{bytes} bytes of the line are left, then continue
2338 normally. @var{bytes} has the same format as for the @option{--bytes} option.
2343 Write a diagnostic to standard error just before each output file is opened.
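The naming scheme and the round-trip property described at the start of this section can be sketched as follows. The scratch directory, the file name @file{input.txt}, and the prefix @samp{piece.} are assumptions made for the example, not part of @command{split} itself.

```shell
# Work in a scratch directory (its name is chosen by mktemp).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1\n2\n3\n4\n5\n' > input.txt

# Two lines per piece; output names start with the prefix "piece.".
split -l 2 input.txt piece.

# List the pieces on one line, separated by spaces.
pieces=$(ls piece.* | tr '\n' ' ')
echo "$pieces"

# Concatenating the pieces in sorted name order restores the input.
cat piece.* > rejoined.txt
cmp input.txt rejoined.txt && roundtrip=ok
echo "$roundtrip"
```

Five input lines at two lines per piece yield three pieces, the last one holding the single leftover line.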
2348 @node csplit invocation
2349 @section @command{csplit}: Split a file into context-determined pieces
2352 @cindex context splitting
2353 @cindex splitting a file into pieces by context
2355 @command{csplit} creates zero or more output files containing sections of
2356 @var{input} (standard input if @var{input} is @samp{-}). Synopsis:
2359 csplit [@var{option}]@dots{} @var{input} @var{pattern}@dots{}
2362 The contents of the output files are determined by the @var{pattern}
2363 arguments, as detailed below. An error occurs if a @var{pattern}
2364 argument refers to a nonexistent line of the input file (e.g., if no
2365 remaining line matches a given regular expression). After every
2366 @var{pattern} has been matched, any remaining input is copied into one last output file.
2369 By default, @command{csplit} prints the number of bytes written to each
2370 output file after it has been created.
2372 The types of pattern arguments are:
2377 Create an output file containing the input up to but not including line
2378 @var{n} (a positive integer). If followed by a repeat count, also
2379 create an output file containing the next @var{n} lines of the input
2380 file once for each repeat.
2382 @item /@var{regexp}/[@var{offset}]
2383 Create an output file containing the current line up to (but not
2384 including) the next line of the input file that contains a match for
2385 @var{regexp}. The optional @var{offset} is a @samp{+} or @samp{-}
2386 followed by a positive integer. If it is given, the input up to the
2387 matching line plus or minus @var{offset} is put into the output file,
2388 and the line after that begins the next section of input.
2390 @item %@var{regexp}%[@var{offset}]
2391 Like the previous type, except that it does not create an output
2392 file, so that section of the input file is effectively ignored.
2394 @item @{@var{repeat-count}@}
2395 Repeat the previous pattern @var{repeat-count} additional
2396 times. @var{repeat-count} can either be a positive integer or an
2397 asterisk, meaning repeat as many times as necessary until the input is exhausted.
2402 The output files' names consist of a prefix (@samp{xx} by default)
2403 followed by a suffix. By default, the suffix is an ascending sequence
2404 of two-digit decimal numbers from @samp{00} to @samp{99}. In any case,
2405 concatenating the output files in sorted order by filename produces the
2406 original input file.
2408 By default, if @command{csplit} encounters an error or receives a hangup,
2409 interrupt, quit, or terminate signal, it removes any output files
2410 that it has created so far before it exits.
2412 The program accepts the following options. Also see @ref{Common options}.
2416 @item -f @var{prefix}
2417 @itemx --prefix=@var{prefix}
2420 @cindex output file name prefix
2421 Use @var{prefix} as the output file name prefix.
2423 @item -b @var{suffix}
2424 @itemx --suffix=@var{suffix}
2427 @cindex output file name suffix
2428 Use @var{suffix} as the output file name suffix. When this option is
2429 specified, the suffix string must include exactly one
2430 @code{printf(3)}-style conversion specification, possibly including
2431 format specification flags, a field width, a precision specification,
2432 or all of these kinds of modifiers. The format letter must convert a
2433 binary integer argument to readable form; thus, only @samp{d}, @samp{i},
2434 @samp{u}, @samp{o}, @samp{x}, and @samp{X} conversions are allowed. The
2435 entire @var{suffix} is given (with the current output file number) to
2436 @code{sprintf(3)} to form the file name suffixes for each of the
2437 individual output files in turn. If this option is used, the
2438 @option{--digits} option is ignored.
2440 @item -n @var{digits}
2441 @itemx --digits=@var{digits}
2444 Use output file names containing numbers that are @var{digits} digits
2445 long instead of the default 2.
2450 @opindex --keep-files
2451 Do not remove output files when errors are encountered.
2454 @itemx --elide-empty-files
2456 @opindex --elide-empty-files
2457 Suppress the generation of zero-length output files. (In cases where
2458 the section delimiters of the input file are supposed to mark the first
2459 lines of each of the sections, the first output file will generally be a
2460 zero-length file unless you use this option.) The output file sequence
2461 numbers always run consecutively starting from 0, even when this option is specified.
2472 Do not print counts of output file sizes.
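A common use of the regular-expression pattern together with a repeat count is to split a file at every line that starts a new section. The sketch below assumes @sc{gnu} @command{csplit}; the marker @samp{BEGIN}, the prefix @samp{sec}, and the file name @file{input.txt} are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'BEGIN one\na\nBEGIN two\nb\nc\n' > input.txt

# -s: do not print byte counts.
# -z: do not keep the zero-length piece that precedes the first marker.
# -f sec: use "sec" as the output file name prefix.
# '{*}': repeat the pattern until the input is exhausted.
csplit -s -z -f sec input.txt '/^BEGIN/' '{*}'

first=$(cat sec00)
second=$(cat sec01)
printf '%s\n---\n%s\n' "$first" "$second"
```

Each piece begins with its own marker line, because the matching line starts the next section rather than ending the previous one.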
2477 @node Summarizing files
2478 @chapter Summarizing files
2480 @cindex summarizing files
2482 These commands generate just a few numbers representing entire contents of files.
2486 * wc invocation:: Print byte, word, and line counts.
2487 * sum invocation:: Print checksum and block counts.
2488 * cksum invocation:: Print CRC checksum and byte counts.
2489 * md5sum invocation:: Print or check message-digests.
2494 @section @code{wc}: Print byte, word, and line counts
2498 @cindex character count
2502 @code{wc} counts the number of bytes, characters, whitespace-separated
2503 words, and newlines in each given @var{file}, or standard input if none
2504 are given or for a @var{file} of @samp{-}. Synopsis:
2507 wc [@var{option}]@dots{} [@var{file}]@dots{}
2510 @cindex total counts
2511 @vindex POSIXLY_CORRECT
2512 @code{wc} prints one line of counts for each file, and if the file was
2513 given as an argument, it prints the file name following the counts. If
2514 more than one @var{file} is given, @code{wc} prints a final line
2515 containing the cumulative counts, with the file name @file{total}. The
2516 counts are printed in this order: newlines, words, characters, bytes.
2517 By default, each count is output right-justified in a 7-byte field with
2518 one space between fields so that the numbers and file names line up nicely
2519 in columns. However, @acronym{POSIX} requires that there be exactly one space
2520 separating columns. You can make @code{wc} use the @acronym{POSIX}-mandated
2521 output format by setting the @env{POSIXLY_CORRECT} environment variable.
2523 By default, @code{wc} prints three counts: the newline, words, and byte
2524 counts. Options can specify that only certain counts be printed.
2525 Options do not undo others previously given, so @samp{wc --bytes --words}
2532 prints both the byte counts and the word counts.
2534 With the @code{--max-line-length} option, @code{wc} prints the length
2535 of the longest line per file, and if there is more than one file it
2536 prints the maximum (not the sum) of those lengths.
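The individual counts can be requested one at a time, as in the following sketch. Because the amount of padding in the output depends on the implementation and on @env{POSIXLY_CORRECT}, the example strips spaces with @command{tr}; the sample text and variable names are illustrative only, and @option{-L} is assumed to be available as the short form of @option{--max-line-length}.

```shell
sample='one two\nthree\n'

# Strip any padding that wc adds, keeping only the number.
lines=$(printf "$sample" | wc -l | tr -d ' ')
words=$(printf "$sample" | wc -w | tr -d ' ')
bytes=$(printf "$sample" | wc -c | tr -d ' ')
longest=$(printf "$sample" | wc -L | tr -d ' ')

echo "$lines $words $bytes $longest"
```

For this input the script should print @samp{2 3 14 7}: two newlines, three words, fourteen bytes, and a longest line of seven characters.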
2538 The program accepts the following options. Also see @ref{Common options}.
2546 Print only the byte counts.
2552 Print only the character counts.
2558 Print only the word counts.
2564 Print only the newline counts.
2567 @itemx --max-line-length
2569 @opindex --max-line-length
2570 Print only the maximum line lengths.
2575 @node sum invocation
2576 @section @command{sum}: Print checksum and block counts
2579 @cindex 16-bit checksum
2580 @cindex checksum, 16-bit
2582 @command{sum} computes a 16-bit checksum for each given @var{file}, or
2583 standard input if none are given or for a @var{file} of @samp{-}. Synopsis:
2586 sum [@var{option}]@dots{} [@var{file}]@dots{}
2589 @command{sum} prints the checksum for each @var{file} followed by the
2590 number of blocks in the file (rounded up). If more than one @var{file}
2591 is given, file names are also printed (by default). (With the
2592 @option{--sysv} option, corresponding file names are printed when there is
2593 at least one file argument.)
2595 By default, @sc{gnu} @command{sum} computes checksums using an algorithm
2596 compatible with BSD @command{sum} and prints file sizes in units of 1024-byte blocks.
2599 The program accepts the following options. Also see @ref{Common options}.
2605 @cindex BSD @command{sum}
2606 Use the default (BSD compatible) algorithm. This option is included for
2607 compatibility with the System V @command{sum}. Unless @option{-s} was also
2608 given, it has no effect.
2614 @cindex System V @command{sum}
2615 Compute checksums using an algorithm compatible with System V
2616 @command{sum}'s default, and print file sizes in units of 512-byte blocks.
2620 @command{sum} is provided for compatibility; the @code{cksum} program (see
2621 next section) is preferable in new applications.
2624 @node cksum invocation
2625 @section @command{cksum}: Print CRC checksum and byte counts
2628 @cindex cyclic redundancy check
2629 @cindex CRC checksum
2631 @command{cksum} computes a cyclic redundancy check (CRC) checksum for each
2632 given @var{file}, or standard input if none are given or for a
2633 @var{file} of @samp{-}. Synopsis:
2636 cksum [@var{option}]@dots{} [@var{file}]@dots{}
2639 @command{cksum} prints the CRC checksum for each file along with the number
2640 of bytes in the file, and the filename unless no arguments were given.
2642 @command{cksum} is typically used to ensure that files
2643 transferred by unreliable means (e.g., netnews) have not been corrupted,
2644 by comparing the @command{cksum} output for the received files with the
2645 @command{cksum} output for the original files (typically given in the distribution).
2648 The CRC algorithm is specified by the @acronym{POSIX} standard. It is not
2649 compatible with the BSD or System V @command{sum} algorithms (see the
2650 previous section); it is more robust.
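The comparison described above can be sketched as follows. The file names @file{original} and @file{received} stand in for the two ends of a transfer and are assumptions of the example; here the "transfer" is simply a local copy.

```shell
workdir=$(mktemp -d)
printf 'payload\n' > "$workdir/original"
cp "$workdir/original" "$workdir/received"

# Reading from standard input keeps the file name out of the output,
# so the two results can be compared directly.
sent=$(cksum < "$workdir/original")
got=$(cksum < "$workdir/received")

if [ "$sent" = "$got" ]; then
  transfer=intact
else
  transfer=corrupted
fi
echo "$transfer"

# The CRC and byte count reported for empty input.
empty=$(cksum < /dev/null)
echo "$empty"
```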
2652 The only options are @option{--help} and @option{--version}. @xref{Common options}.
2656 @node md5sum invocation
2657 @section @command{md5sum}: Print or check message-digests
2660 @cindex 128-bit checksum
2661 @cindex checksum, 128-bit
2662 @cindex fingerprint, 128-bit
2663 @cindex message-digest, 128-bit
2665 @command{md5sum} computes a 128-bit checksum (or @dfn{fingerprint} or
2666 @dfn{message-digest}) for each specified @var{file}.
2667 If a @var{file} is specified as @samp{-} or if no files are given
2668 @command{md5sum} computes the checksum for the standard input.
2669 @command{md5sum} can also determine whether a file and checksum are
2670 consistent. Synopses:
2673 md5sum [@var{option}]@dots{} [@var{file}]@dots{}
2674 md5sum [@var{option}]@dots{} --check [@var{file}]
2677 For each @var{file}, @samp{md5sum} outputs the MD5 checksum, a flag
2678 indicating a binary or text input file, and the filename.
2679 If @var{file} is omitted or specified as @samp{-}, standard input is read.
2681 The program accepts the following options. Also see @ref{Common options}.
2689 @cindex binary input files
2690 Treat all input files as binary. This option has no effect on Unix
2691 systems, since they don't distinguish between binary and text files.
2692 This option is useful on systems that have different internal and
2693 external character representations. On MS-DOS and MS-Windows, this is
2698 Read filenames and checksum information from the single @var{file}
2699 (or from stdin if no @var{file} was specified) and report whether
2700 each named file and the corresponding checksum data are consistent.
2701 The input to this mode of @command{md5sum} is usually the output of
2702 a prior, checksum-generating run of @samp{md5sum}.
2703 Each valid line of input consists of an MD5 checksum, a binary/text
2704 flag, and then a filename.
2705 Binary files are marked with @samp{*}, text with @samp{ }.
2706 For each such line, @command{md5sum} reads the named file and computes its
2707 MD5 checksum. Then, if the computed message digest does not match the
2708 one on the line with the filename, the file is noted as having
2709 failed the test. Otherwise, the file passes the test.
2710 By default, for each valid line, one line is written to standard
2711 output indicating whether the named file passed the test.
2712 After all checks have been performed, if there were any failures,
2713 a warning is issued to standard error.
2714 Use the @option{--status} option to inhibit that output.
2715 If any listed file cannot be opened or read, if any valid line has
2716 an MD5 checksum inconsistent with the associated file, or if no valid
2717 line is found, @command{md5sum} exits with nonzero status. Otherwise,
2718 it exits successfully.
2722 @cindex verifying MD5 checksums
2723 This option is useful only when verifying checksums.
2724 When verifying checksums, don't generate the default one-line-per-file
2725 diagnostic and don't output the warning summarizing any failures.
2726 Failures to open or read a file still evoke individual diagnostics to standard error.
2728 If all listed files are readable and are consistent with the associated
2729 MD5 checksums, exit successfully. Otherwise exit with a status code
2730 indicating there was a failure.
2736 @cindex text input files
2737 Treat all input files as text files. This is the reverse of @option{--binary}.
2744 @cindex verifying MD5 checksums
2745 When verifying checksums, warn about improperly formatted MD5 checksum lines.
2746 This option is useful only if all but a few lines in the checked input are valid.
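A typical generate-then-verify cycle with @option{--check} and @option{--status} might look like the sketch below. The file names @file{data.txt} and @file{sums.md5} are invented for the example, and the exit status is turned into a word only to make the outcome easy to read.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello\n' > data.txt

# Generate a checksum list, then verify it without any output.
md5sum data.txt > sums.md5
if md5sum --check --status sums.md5; then before=ok; else before=failed; fi

# Change the file and verify again; the check now fails.
printf 'changed\n' >> data.txt
if md5sum --check --status sums.md5; then after=ok; else after=failed; fi

echo "before=$before after=$after"
```

With @option{--status}, only the exit status reports the result, which makes this form convenient in scripts.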
2752 @node Operating on sorted files
2753 @chapter Operating on sorted files
2755 @cindex operating on sorted files
2756 @cindex sorted files, operations on
2758 These commands work with (or produce) sorted files.
2761 * sort invocation:: Sort text files.
2762 * uniq invocation:: Uniquify files.
2763 * comm invocation:: Compare two sorted files line by line.
2764 * ptx invocation:: Produce a permuted index of file contents.
2765 * tsort invocation:: Topological sort.
2766 * tsort background:: Where tsort came from.
2770 @node sort invocation
2771 @section @command{sort}: Sort text files
2774 @cindex sorting files
2776 @command{sort} sorts, merges, or compares all the lines from the given
2777 files, or standard input if none are given or for a @var{file} of
2778 @samp{-}. By default, @command{sort} writes the results to standard output. Synopsis:
2782 sort [@var{option}]@dots{} [@var{file}]@dots{}
2785 @command{sort} has three modes of operation: sort (the default), merge,
2786 and check for sortedness. The following options change the operation mode:
2795 @cindex checking for sortedness
2796 Check whether the given files are already sorted: if they are not all
2797 sorted, print an error message and exit with a status of 1.
2798 Otherwise, exit successfully.
2804 @cindex merging sorted files
2805 Merge the given files by sorting them as a group. Each input file must
2806 always be individually sorted. It always works to sort instead of
2807 merge; merging is provided because it is faster, in the case where it works.
2814 A pair of lines is compared as follows: if any key fields have
2815 been specified, @command{sort} compares each pair of fields, in the
2816 order specified on the command line, according to the associated
2817 ordering options, until a difference is found or no fields are left.
2818 Unless otherwise specified, all comparisons use the character collating
2819 sequence specified by the @env{LC_COLLATE} locale.@footnote{If you
2820 use a non-@acronym{POSIX} locale (e.g., by setting @env{LC_ALL}
2821 to @samp{en_US}), then @command{sort} may produce output that is sorted
2822 differently than you're accustomed to. In that case, set the @env{LC_ALL}
2823 environment variable to @samp{C}. Note that setting only @env{LC_COLLATE}
2824 has two problems. First, it is ineffective if @env{LC_ALL} is also set.
2825 Second, it has undefined behavior if @env{LC_CTYPE} (or @env{LANG}, if
2826 @env{LC_CTYPE} is unset) is set to an incompatible value. For example,
2827 you get undefined behavior if @env{LC_CTYPE} is @code{ja_JP.PCK} but
2828 @env{LC_COLLATE} is @code{en_US.UTF-8}.}
2830 If any of the global options @samp{bdfgiMnr} are given but no key fields
2831 are specified, @command{sort} compares the entire lines according to the
2834 Finally, as a last resort when all keys compare equal (or if no ordering
2835 options were specified at all), @command{sort} compares the entire lines.
2836 The last resort comparison honors the @option{--reverse} (@option{-r})
2837 global option. The @option{--stable} (@option{-s}) option disables this
2838 last-resort comparison so that lines in which all fields compare equal
2839 are left in their original relative order. If no fields or global
2840 options are specified, @option{--stable} (@option{-s}) has no effect.
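The effect of the last-resort comparison can be illustrated with a
small sketch. The two sample lines are hypothetical; both have the same
numeric key in field two, so only the tie-breaking rule decides their
order. @env{LC_ALL} is set to @samp{C} here just to keep the comparison
byte-based.

```shell
# Both lines have the same numeric key (1) in field two.
input=$(printf 'b 1\na 1\n')

# Default: the tie is broken by comparing the entire lines.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -k 2,2n)

# With -s: tied lines keep their original relative order.
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k 2,2n)

printf 'default:\n%s\nstable:\n%s\n' "$default_order" "$stable_order"
```

Without @option{-s} the line @samp{a 1} comes first; with @option{-s}
the input order (@samp{b 1}, then @samp{a 1}) is preserved.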
2842 @sc{gnu} @command{sort} (as specified for all @sc{gnu} utilities) has no limits on
2843 input line length or restrictions on bytes allowed within lines. In
2844 addition, if the final byte of an input file is not a newline, @sc{gnu}
2845 @command{sort} silently supplies one. A line's trailing newline is not
2846 part of the line for comparison purposes.
2848 Upon any error, @command{sort} exits with a status of @samp{2}.
2851 If the environment variable @env{TMPDIR} is set, @command{sort} uses its
2852 value as the directory for temporary files instead of @file{/tmp}. The
2853 @option{--temporary-directory} (@option{-T}) option in turn overrides
2854 the environment variable.
2857 The following options affect the ordering of output lines. They may be
2858 specified globally or as part of a specific key field. If no key
2859 fields are specified, global options apply to comparison of entire
2860 lines; otherwise the global options are inherited by key fields that do
2861 not specify any special options of their own. In pre-@acronym{POSIX}
2862 versions of @command{sort}, global options affect only later key fields,
2863 so portable shell scripts should specify global options first.
2868 @itemx --ignore-leading-blanks
2870 @opindex --ignore-leading-blanks
2871 @cindex blanks, ignoring leading
2873 Ignore leading blanks when finding sort keys in each line.
2874 The @env{LC_CTYPE} locale determines character types.
2877 @itemx --dictionary-order
2879 @opindex --dictionary-order
2880 @cindex dictionary order
2881 @cindex phone directory order
2882 @cindex telephone directory order
2884 Sort in @dfn{phone directory} order: ignore all characters except
2885 letters, digits and blanks when sorting.
2886 The @env{LC_CTYPE} locale determines character types.
2889 @itemx --ignore-case
2891 @opindex --ignore-case
2892 @cindex ignoring case
2893 @cindex case folding
2895 Fold lowercase characters into the equivalent uppercase characters when
2896 comparing so that, for example, @samp{b} and @samp{B} sort as equal.
2897 The @env{LC_CTYPE} locale determines character types.
2900 @itemx --general-numeric-sort
2902 @opindex --general-numeric-sort
2903 @cindex general numeric sort
2905 Sort numerically, using the standard C function @code{strtod} to convert
2906 a prefix of each line to a double-precision floating point number.
2907 This allows floating point numbers to be specified in scientific notation,
2908 like @code{1.0e-34} and @code{10e100}.
2909 The @env{LC_NUMERIC} locale determines the decimal-point character.
2910 Do not report overflow, underflow, or conversion errors.
2911 Use the following collating sequence:
2915 Lines that do not start with numbers (all considered to be equal).
2917 NaNs (``Not a Number'' values, in IEEE floating point arithmetic)
2918 in a consistent but machine-dependent order.
2922 Finite numbers in ascending numeric order (with @math{-0} and @math{+0} equal).
2927 Use this option only if there is no alternative; it is much slower than
2928 @option{--numeric-sort} (@option{-n}) and it can lose information when
2929 converting to floating point.
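A short sketch of the difference from @option{-n}, using three
hypothetical values written partly in scientific notation:

```shell
# Hypothetical sample values: 1e3 is 1000 and 2.5e1 is 25.
values=$(printf '1e3\n5\n2.5e1\n')

# -g parses the exponents, so the order follows the true numeric values.
general=$(printf '%s\n' "$values" | LC_ALL=C sort -g)

# -n does not recognize exponents, so it sees 1, 5 and 2.5 instead.
plain=$(printf '%s\n' "$values" | LC_ALL=C sort -n)

printf 'with -g:\n%s\nwith -n:\n%s\n' "$general" "$plain"
```

With @option{-g} the order is @samp{5}, @samp{2.5e1}, @samp{1e3}; with
@option{-n} it is @samp{1e3}, @samp{2.5e1}, @samp{5}.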
2932 @itemx --ignore-nonprinting
2934 @opindex --ignore-nonprinting
2935 @cindex nonprinting characters, ignoring
2936 @cindex unprintable characters, ignoring
2938 Ignore nonprinting characters.
2939 The @env{LC_CTYPE} locale determines character types.
2944 @opindex --month-sort
2945 @cindex months, sorting by
2947 An initial string, consisting of any amount of whitespace, followed
2948 by a month name abbreviation, is folded to UPPER case and
2949 compared in the order @samp{JAN} < @samp{FEB} < @dots{} < @samp{DEC}.
2950 Invalid names compare low to valid names. The @env{LC_TIME} locale
2951 category determines the month spellings.
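For instance, assuming the @samp{C} locale (where the month
abbreviations are the English ones), a mixed-case list of hypothetical
sample values sorts into calendar order:

```shell
# Month abbreviations in mixed case; case is folded before comparing.
by_month=$(printf 'mar\nJan\nFEB\n' | LC_ALL=C sort -M)
printf '%s\n' "$by_month"
```

The lines come out as @samp{Jan}, @samp{FEB}, @samp{mar}; only the
comparison is case-folded, not the output.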
2954 @itemx --numeric-sort
2956 @opindex --numeric-sort
2957 @cindex numeric sort
2959 Sort numerically: the number begins each line; specifically, it consists
2960 of optional whitespace, an optional @samp{-} sign, and zero or more
2961 digits possibly separated by thousands separators, optionally followed
2962 by a decimal-point character and zero or more digits. The @env{LC_NUMERIC}
2963 locale specifies the decimal-point character and thousands separator.
2965 Numeric sort uses what might be considered an unconventional method to
2966 compare strings representing floating point numbers. Rather than first
2967 converting each string to the C @code{double} type and then comparing
2968 those values, @command{sort} aligns the decimal-point characters in the
2969 two strings and compares the strings a character at a time. One benefit
2970 of using this approach is its speed. In practice this is much more
2971 efficient than performing the two corresponding string-to-double (or
2972 even string-to-integer) conversions and then comparing doubles. In
2973 addition, there is no corresponding loss of precision. Converting each
2974 string to @code{double} before comparison would limit precision to about
2975 16 digits on most systems.
2977 Neither a leading @samp{+} nor exponential notation is recognized.
2978 To compare such strings numerically, use the
2979 @option{--general-numeric-sort} (@option{-g}) option.
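The following sketch contrasts the default character comparison with
@option{-n} on a few hypothetical values, including a negative number
and a decimal fraction:

```shell
values=$(printf '10\n9\n-3\n2.5\n')

# Default comparison is by character, so "10" sorts before "9".
as_text=$(printf '%s\n' "$values" | LC_ALL=C sort)

# -n compares by numeric value.
as_numbers=$(printf '%s\n' "$values" | LC_ALL=C sort -n)

printf 'text:\n%s\nnumeric:\n%s\n' "$as_text" "$as_numbers"
```

The numeric order is @samp{-3}, @samp{2.5}, @samp{9}, @samp{10}, whereas
the character order in the @samp{C} locale is @samp{-3}, @samp{10},
@samp{2.5}, @samp{9}.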
2985 @cindex reverse sorting
2986 Reverse the result of comparison, so that lines with greater key values
2987 appear earlier in the output instead of later.
2995 @item -o @var{output-file}
2996 @itemx --output=@var{output-file}
2999 @cindex overwriting of input, allowed
3000 Write output to @var{output-file} instead of standard output.
3001 If necessary, @command{sort} reads input before opening
3002 @var{output-file}, so you can safely sort a file in place by using
3003 commands like @code{sort -o F F} and @code{cat F | sort -o F}.
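A minimal sketch of sorting a file in place, using a scratch file with
hypothetical contents:

```shell
tmp=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$tmp/F"

# The output file is also the input file.
LC_ALL=C sort -o "$tmp/F" "$tmp/F"

in_place=$(cat "$tmp/F")
printf '%s\n' "$in_place"
rm -rf "$tmp"
```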
3005 @vindex POSIXLY_CORRECT
3006 On newer systems, @option{-o} cannot appear after an input file if
3007 @env{POSIXLY_CORRECT} is set, e.g., @samp{sort F -o F}. Portable
3008 scripts should specify @option{-o @var{output-file}} before any input
3012 @itemx --buffer-size=@var{size}
3014 @opindex --buffer-size
3015 @cindex size for main memory sorting
3016 Use a main-memory sort buffer of the given @var{size}. By default,
3017 @var{size} is in units of 1024 bytes. Appending @samp{%} causes
3018 @var{size} to be interpreted as a percentage of physical memory.
3019 Appending @samp{K} multiplies @var{size} by 1024 (the default),
3020 @samp{M} by 1,048,576, @samp{G} by 1,073,741,824, and so on for
3021 @samp{T}, @samp{P}, @samp{E}, @samp{Z}, and @samp{Y}. Appending
3022 @samp{b} causes @var{size} to be interpreted as a byte count, with no
3025 This option can improve the performance of @command{sort} by causing it
3026 to start with a larger or smaller sort buffer than the default.
3027 However, this option affects only the initial buffer size. The buffer
3028 grows beyond @var{size} if @command{sort} encounters input lines larger
3031 @item -t @var{separator}
3032 @itemx --field-separator=@var{separator}
3034 @opindex --field-separator
3035 @cindex field separator character
3036 Use character @var{separator} as the field separator when finding the
3037 sort keys in each line. By default, fields are separated by the empty
3038 string between a non-whitespace character and a whitespace character.
3039 That is, given the input line @w{@samp{ foo bar}}, @command{sort} breaks it
3040 into fields @w{@samp{ foo}} and @w{@samp{ bar}}. The field separator is
3041 not considered to be part of either the field preceding or the field
3042 following. But note that sort fields that extend to the end of the line,
3043 as @option{-k 2}, or sort fields consisting of a range, as @option{-k 2,3},
3044 retain the field separators present between the endpoints of the range.
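As an illustration, the sketch below sorts hypothetical colon-separated
records of the form @samp{name:quantity} on the second field only:

```shell
# Hypothetical colon-separated records: name:quantity.
records=$(printf 'pear:3\napple:10\nfig:7\n')

# Split on ':' and sort numerically on the second field only.
by_quantity=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2n)
printf '%s\n' "$by_quantity"
```

The result is @samp{pear:3}, @samp{fig:7}, @samp{apple:10}.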
3046 @item -T @var{tempdir}
3047 @itemx --temporary-directory=@var{tempdir}
3049 @opindex --temporary-directory
3050 @cindex temporary directory
3052 Use directory @var{tempdir} to store temporary files, overriding the
3053 @env{TMPDIR} environment variable. If this option is given more than
3054 once, temporary files are stored in all the directories given. If you
3055 have a large sort or merge that is I/O-bound, you can often improve
3056 performance by using this option to specify directories on different
3057 disks and controllers.
3063 @cindex uniquifying output
3065 Normally, output only the first of a sequence of lines that compare
3066 equal. For the @option{--check} (@option{-c}) option,
3067 check that no pair of consecutive lines compares equal.
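Both uses can be sketched as follows, with hypothetical input. The
first command removes duplicates while sorting; the second combines
@option{-c} with @option{-u} and so fails on input that is ordered but
contains two equal neighboring lines.

```shell
# Sort and keep one line from each run of equal lines.
unique=$(printf 'b\na\nb\na\n' | LC_ALL=C sort -u)
printf '%s\n' "$unique"

# The input is in order, but -c -u also rejects equal neighbors.
dup_status=0
printf 'a\na\nb\n' | LC_ALL=C sort -c -u 2>/dev/null || dup_status=$?
echo "status with duplicates: $dup_status"
```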
3069 @item -k @var{pos1}[,@var{pos2}]
3070 @itemx --key=@var{pos1}[,@var{pos2}]
3074 Specify a sort field that consists of the part of the line between
3075 @var{pos1} and @var{pos2} (or the end of the line, if @var{pos2} is
3076 omitted), @emph{inclusive}. Fields and character positions are numbered
3077 starting with 1. So to sort on the second field, you'd use
3078 @option{--key=2,2} (@option{-k 2,2}). See below for more examples.
3081 @itemx --zero-terminated
3083 @opindex --zero-terminated
3084 @cindex sort zero-terminated lines
3085 Treat the input as a set of lines, each terminated by a zero byte
3086 (@acronym{ASCII} @sc{nul} (Null) character) instead of an
3087 @acronym{ASCII} @sc{lf} (Line Feed).
3088 This option can be useful in conjunction with @samp{perl -0} or
3089 @samp{find -print0} and @samp{xargs -0} which do the same in order to
3090 reliably handle arbitrary pathnames (even those which contain Line Feed
3095 Historical (BSD and System V) implementations of @command{sort} have
3096 differed in their interpretation of some options, particularly
3097 @option{-b}, @option{-f}, and @option{-n}. @sc{gnu} @command{sort} follows the @acronym{POSIX}
3098 behavior, which is usually (but not always!) like the System V behavior.
3099 According to @acronym{POSIX}, @option{-n} no longer implies @option{-b}. For
3100 consistency, @option{-M} has been changed in the same way. This may
3101 affect the meaning of character positions in field specifications in
3102 obscure cases. The only fix is to add an explicit @option{-b}.
3104 A position in a sort field specified with the @option{-k}
3105 option has the form @samp{@var{f}.@var{c}}, where @var{f} is the number
3106 of the field to use and @var{c} is the number of the first character
3107 from the beginning of the field. In a start position, an omitted
3108 @samp{.@var{c}} stands for the field's first character. In an end
3109 position, an omitted or zero @samp{.@var{c}} stands for the field's
3110 last character. If the
3111 @option{-b} option was specified, the @samp{.@var{c}} part of a field
3112 specification is counted from the first nonblank character of the field.
3114 A sort key position may also have any of the option letters @samp{Mbdfinr}
3115 appended to it, in which case the global ordering options are not used
3116 for that particular field. The @option{-b} option may be independently
3117 attached to either or both of the start and
3118 end positions of a field specification, and if it is inherited
3119 from the global options it will be attached to both.
3120 Keys may span multiple fields.
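A small sketch of character positions and per-key option letters,
using hypothetical colon-separated records. The primary key is the
second and third characters of field one; ties are resolved by field
two in descending numeric order.

```shell
# Hypothetical records: a three-letter code, then a number.
records=$(printf 'xab:1\nyaa:2\nzab:3\n')

# Key 1: characters 2-3 of field one, compared as text.
# Key 2: field two, numeric (n) and reversed (r) for this key only.
ordered=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 1.2,1.3 -k 2,2nr)
printf '%s\n' "$ordered"
```

Here @samp{yaa:2} sorts first because @samp{aa} precedes @samp{ab}, and
the two @samp{ab} lines follow with the larger number first.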
3122 On older systems, @command{sort} supports an obsolete origin-zero
3123 syntax @samp{+@var{pos1} [-@var{pos2}]} for specifying sort keys.
3124 @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance}) does not allow
3125 this; use @option{-k} instead.
3127 Here are some examples to illustrate various combinations of options.
3132 Sort in descending (reverse) numeric order.
3139 Sort alphabetically, omitting the first and second fields.
3140 This uses a single key composed of the characters beginning
3141 at the start of field three and extending to the end of each line.
3148 Sort numerically on the second field and resolve ties by sorting
3149 alphabetically on the third and fourth characters of field five.
3150 Use @samp{:} as the field delimiter.
3153 sort -t : -k 2,2n -k 5.3,5.4
3156 Note that if you had written @option{-k 2} instead of @option{-k 2,2}
3157 @command{sort} would have used all characters beginning in the second field
3158 and extending to the end of the line as the primary @emph{numeric}
3159 key. For the large majority of applications, treating keys spanning
3160 more than one field as numeric will not do what you expect.
3162 Also note that the @samp{n} modifier was applied to the field-end
3163 specifier for the first key. It would have been equivalent to
3164 specify @option{-k 2n,2} or @option{-k 2n,2n}. All modifiers except
3165 @samp{b} apply to the associated @emph{field}, regardless of whether
3166 the modifier character is attached to the field-start and/or the
3167 field-end part of the key specifier.
3170 Sort the password file on the fifth field and ignore any
3171 leading white space. Sort lines with equal values in field five
3172 on the numeric user ID in field three.
3175 sort -t : -k 5b,5 -k 3,3n /etc/passwd
3178 An alternative is to use the global numeric modifier @option{-n}.
3181 sort -t : -n -k 5b,5 -k 3,3 /etc/passwd
3185 Generate a tags file in case-insensitive sorted order.
3188 find src -type f -print0 | sort -t / -z -f | xargs -0 etags --append
3191 The use of @option{-print0}, @option{-z}, and @option{-0} in this case means
3192 that pathnames that contain Line Feed characters will not get broken up
3193 by the sort operation.
3195 Finally, to ignore both leading and trailing white space, you
3196 could have applied the @samp{b} modifier to the field-end specifier
3200 sort -t : -n -k 5b,5b -k 3,3 /etc/passwd
3203 or by using the global @option{-b} modifier instead of @option{-n}
3204 and an explicit @samp{n} with the second key specifier.
3207 sort -t : -b -k 5,5 -k 3,3n /etc/passwd
3210 @c This example is a bit contrived and needs more explanation.
3212 @c Sort records separated by an arbitrary string by using a pipe to convert
3213 @c each record delimiter string to @samp{\0}, then using sort's -z option,
3214 @c and converting each @samp{\0} back to the original record delimiter.
3217 @c printf 'c\n\nb\n\na\n'|perl -0pe 's/\n\n/\n\0/g'|sort -z|perl -0pe 's/\0/\n/g'
3223 @node uniq invocation
3224 @section @command{uniq}: Uniquify files
3227 @cindex uniquify files
3229 @command{uniq} writes the unique lines in the given @var{input}, or
3230 standard input if nothing is given or for an @var{input} name of
3234 uniq [@var{option}]@dots{} [@var{input} [@var{output}]]
3237 By default, @command{uniq} prints the unique lines in a sorted file, i.e.,
3238 discards all but one of identical successive lines. Optionally, it can
3239 instead show only lines that appear exactly once, or lines that appear
3242 The input need not be sorted, but duplicate input lines are detected
3243 only if they are adjacent. If you want to discard non-adjacent
3244 duplicate lines, perhaps you want to use @code{sort -u}.
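The difference can be seen in a short sketch with hypothetical input in
which the two @samp{a} lines are not adjacent:

```shell
lines=$(printf 'a\nb\na\n')

# uniq alone only collapses adjacent duplicates, so both "a" lines remain.
adjacent_only=$(printf '%s\n' "$lines" | LC_ALL=C uniq)

# sort -u brings equal lines together and keeps one of each.
all_duplicates=$(printf '%s\n' "$lines" | LC_ALL=C sort -u)

printf 'uniq:\n%s\nsort -u:\n%s\n' "$adjacent_only" "$all_duplicates"
```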
3247 Comparisons use the character collating sequence specified by the
3248 @env{LC_COLLATE} locale category.
3250 If no @var{output} file is specified, @command{uniq} writes to standard
3253 The program accepts the following options. Also see @ref{Common options}.
3258 @itemx --skip-fields=@var{n}
3260 @opindex --skip-fields
3261 Skip @var{n} fields on each line before checking for uniqueness. Fields
3262 are sequences of non-space non-tab characters that are separated from
3263 each other by at least one space or tab.
3265 On older systems, @command{uniq} supports an obsolete option
3266 @option{-@var{n}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance})
3267 does not allow this; use @option{-f @var{n}} instead.
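As a sketch, suppose each line starts with a sequence number followed
by a message (the sample lines are hypothetical). Skipping the first
field makes @command{uniq} compare the messages only:

```shell
# Hypothetical log lines: a sequence number, then a message.
log=$(printf '1 disk full\n2 disk full\n3 link down\n')

# Skip the first field so that only the message is compared.
distinct_messages=$(printf '%s\n' "$log" | LC_ALL=C uniq -f 1)
printf '%s\n' "$distinct_messages"
```

The first line of each run is the one that is kept, so the output is
@samp{1 disk full} followed by @samp{3 link down}.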
3270 @itemx --skip-chars=@var{n}
3272 @opindex --skip-chars
3273 Skip @var{n} characters before checking for uniqueness. If you use both
3274 the field and character skipping options, fields are skipped over first.
3276 On older systems, @command{uniq} supports an obsolete option
3277 @option{+@var{n}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance})
3278 does not allow this; use @option{-s @var{n}} instead.
3284 Print the number of times each line occurred along with the line.
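A common pattern, sketched here with hypothetical input, is to sort
first so that equal lines become adjacent and then count them:

```shell
# Sort so that equal lines are adjacent, then count each run.
counts=$(printf 'b\na\nb\n' | LC_ALL=C sort | LC_ALL=C uniq -c)
printf '%s\n' "$counts"
```

Each output line holds a count followed by the line itself; the count
is padded with leading blanks, and the exact padding width should not
be relied upon.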
3287 @itemx --ignore-case
3289 @opindex --ignore-case
3290 Ignore differences in case when comparing lines.
3296 @cindex duplicate lines, outputting
3297 Print one copy of each duplicate line.
3300 @itemx --all-repeated[=@var{delimit-method}]
3302 @opindex --all-repeated
3303 @cindex all duplicate lines, outputting
3304 Print all copies of each duplicate line.
3305 This option is useful mainly in conjunction with other options, e.g.,
3306 to ignore case or to compare only selected fields.
3307 The optional @var{delimit-method} tells how to delimit
3308 groups of duplicate lines, and must be one of the following:
3313 Do not delimit groups of duplicate lines.
3314 This is equivalent to @option{--all-repeated} (@option{-D}).
3317 Output a newline before each group of duplicate lines.
3320 Separate groups of duplicate lines with a single newline.
3321 This is the same as using @samp{prepend}, except that
3322 there is no newline before the first group, and hence
3323 may be better suited for output direct to users.
3326 Note that when groups are delimited and the input stream contains
3327 two or more consecutive blank lines, then the output is ambiguous.
3328 To avoid that, filter the input through @samp{tr -s '\n'} to replace
3329 each sequence of consecutive newlines with a single newline.
3331 This is a @sc{gnu} extension.
3332 @c FIXME: give an example showing *how* it's useful
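One possible use, sketched with hypothetical input, is to combine this
option with @option{-i} in order to list words that differ only in
case, one group per block:

```shell
words=$(printf 'Apple\napple\nbanana\nCherry\ncherry\n')

# Print every member of each case-insensitive duplicate group,
# with a blank line between groups; "banana" has no duplicate.
groups=$(printf '%s\n' "$words" | LC_ALL=C uniq -i --all-repeated=separate)
printf '%s\n' "$groups"
```

The output lists @samp{Apple} and @samp{apple}, then an empty line,
then @samp{Cherry} and @samp{cherry}.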
3338 @cindex unique lines, outputting
3339 Print non-duplicate lines.
3342 @itemx --check-chars=@var{n}
3344 @opindex --check-chars
3345 Compare @var{n} characters on each line (after skipping any specified
3346 fields and characters). By default the entire rest of the lines are
3352 @node comm invocation
3353 @section @command{comm}: Compare two sorted files line by line
3356 @cindex line-by-line comparison
3357 @cindex comparing sorted files
3359 @command{comm} writes to standard output lines that are common, and lines
3360 that are unique, to two input files; a file name of @samp{-} means
3361 standard input. Synopsis:
3364 comm [@var{option}]@dots{} @var{file1} @var{file2}
3368 Before @command{comm} can be used, the input files must be sorted using the
3369 collating sequence specified by the @env{LC_COLLATE} locale.
3370 If an input file ends in a non-newline
3371 character, a newline is silently appended. The @command{sort} command with
3372 no options always outputs a file that is suitable input to @command{comm}.
3374 @cindex differing lines
3375 @cindex common lines
3376 With no options, @command{comm} produces three-column output. Column one
3377 contains lines unique to @var{file1}, column two contains lines unique
3378 to @var{file2}, and column three contains lines common to both files.
3379 Columns are separated by a single TAB character.
3380 @c FIXME: when there's an option to supply an alternative separator
3381 @c string, append `by default' to the above sentence.
3386 The options @option{-1}, @option{-2}, and @option{-3} suppress printing of
3387 the corresponding columns. Also see @ref{Common options}.
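The column options can be combined. The sketch below uses two small,
already sorted scratch files with hypothetical names and contents:

```shell
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/file1"
printf 'banana\ncherry\ndate\n' > "$tmp/file2"

# Suppress columns 1 and 2, leaving only the lines common to both files.
common=$(LC_ALL=C comm -12 "$tmp/file1" "$tmp/file2")

# Suppress columns 2 and 3, leaving only the lines unique to file1.
only_in_file1=$(LC_ALL=C comm -23 "$tmp/file1" "$tmp/file2")

printf 'common:\n%s\nonly in file1:\n%s\n' "$common" "$only_in_file1"
rm -rf "$tmp"
```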
3389 Unlike some other comparison utilities, @command{comm} has an exit
3390 status that does not depend on the result of the comparison.
3391 Upon normal completion, @command{comm} produces an exit code of zero.
3392 If there is an error, it exits with nonzero status.
3395 @node tsort invocation
3396 @section @command{tsort}: Topological sort
3399 @cindex topological sort
3401 @command{tsort} performs a topological sort on the given @var{file}, or
3402 standard input if no input file is given or for a @var{file} of
3403 @samp{-}. For more details and some history, see @ref{tsort background}.
3407 tsort [@var{option}] [@var{file}]
3410 @command{tsort} reads its input as pairs of strings, separated by blanks,
3411 indicating a partial ordering. The output is a total ordering that
3412 corresponds to the given partial ordering.
3426 will produce the output
3437 Consider a more realistic example.
3438 You have a large set of functions all in one file, and they may all be
3439 declared static except one. Currently that one (say @code{main}) is the
3440 first function defined in the file, and the ones it calls directly follow
3441 it, followed by those they call, etc. Let's say that you are determined
3442 to take advantage of prototypes, so you have to choose between declaring
3443 all of those functions (which means duplicating a lot of information from
3444 the definitions) and rearranging the functions so that as many as possible
3445 are defined before they are used. One way to automate the latter process
3446 is to get a list for each function of the functions it calls directly.
3447 Many programs can generate such lists. They describe a call graph.
3448 Consider the following list, in which a given line indicates that the
3449 function on the left calls the one on the right directly.
3455 tail_file pretty_name
3456 tail_file write_header
3458 tail_forever recheck
3459 tail_forever pretty_name
3460 tail_forever write_header
3461 tail_forever dump_remainder
3464 tail_lines start_lines
3465 tail_lines dump_remainder
3466 tail_lines file_lines
3467 tail_lines pipe_lines
3469 tail_bytes start_bytes
3470 tail_bytes dump_remainder
3471 tail_bytes pipe_bytes
3472 file_lines dump_remainder
3476 then you can use @command{tsort} to produce an ordering of those
3477 functions that satisfies your requirement.
3480 example$ tsort call-graph | tac
3500 @command{tsort} detects any cycles in the input and writes the first cycle
3501 encountered to standard error.
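A minimal sketch of capturing that diagnostic, using a hypothetical
two-node cycle. The exact wording of the message and the exit status
are not specified here, so the example only keeps whatever is written
to standard error.

```shell
# The pairs "a b" and "b a" contradict each other, forming a cycle.
# Discard the ordering on standard output and keep only the diagnostic.
cycle_report=$(printf 'a b\nb a\n' | tsort 2>&1 >/dev/null) || true
printf '%s\n' "$cycle_report"
```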
3503 Note that for a given partial ordering, generally there is no unique
3504 total ordering. In the context of the call graph above, the function
3505 @code{parse_options} may be placed anywhere in the list as long as it
3506 precedes @code{main}.
3508 The only options are @option{--help} and @option{--version}. @xref{Common
3511 @node tsort background
3512 @section @command{tsort}: Background
3514 @command{tsort} exists because very early versions of the Unix linker processed
3515 an archive file exactly once, and in order. As @code{ld} read each object in
3516 the archive, it decided whether it was needed in the program based on
3517 whether it defined any symbols which were undefined at that point in
3520 This meant that dependencies within the archive had to be handled
3521 specially. For example, @code{scanf} probably calls @code{read}. That means
3522 that in a single pass through an archive, it was important for @code{scanf.o}
3523 to appear before @code{read.o}, because otherwise a program which calls
3524 @code{scanf} but not @code{read} might end up with an unexpected unresolved
3525 reference to @code{read}.
3527 The way to address this problem was to first generate a set of
3528 dependencies of one object file on another. This was done by a shell
3529 script called @code{lorder}. The GNU tools don't provide a version of
3530 @code{lorder}, as far as I know, but you can still find it in BSD
3533 Then you ran @command{tsort} over the @code{lorder} output, and you used the
3534 resulting sort to define the order in which you added objects to the archive.
3536 This whole procedure has been obsolete since about 1980, because
3537 Unix archives now contain a symbol table (traditionally built by
3538 @code{ranlib}, now generally built by @code{ar} itself), and the Unix
3539 linker uses the symbol table to effectively make multiple passes over
3542 Anyhow, that's where @command{tsort} came from: to solve an old problem with
3543 the way the linker handled archive files, which has since been solved
3546 @node ptx invocation
3547 @section @command{ptx}: Produce permuted indexes
3551 @command{ptx} reads a text file and essentially produces a permuted index, with
3552 each keyword in its context. The calling sketch is either one of:
3555 ptx [@var{option} @dots{}] [@var{file} @dots{}]
3556 ptx -G [@var{option} @dots{}] [@var{input} [@var{output}]]
3559 The @option{-G} (or its equivalent: @option{--traditional}) option disables
3560 all @sc{gnu} extensions and reverts to traditional mode, thus introducing some
3561 limitations and changing several of the program's default option values.
3562 When @option{-G} is not specified, @sc{gnu} extensions are always enabled.
3563 @sc{gnu} extensions to @command{ptx} are documented wherever appropriate in this
3564 document. For the full list, see @ref{Compatibility in ptx}.
3566 Individual options are explained in the following sections.
3568 When @sc{gnu} extensions are enabled, there may be zero, one or several
3569 @var{file}s after the options. If there is no @var{file}, the program
3570 reads the standard input. If there are one or more @var{file}s, they
3571 name the input files, which are all read in turn, as if all the
3572 input files were concatenated. However, there is a full contextual
3573 break between each file and, when automatic referencing is requested,
3574 file names and line numbers refer to individual text input files. In
3575 all cases, the program outputs the permuted index to the standard
3578 When @sc{gnu} extensions are @emph{not} enabled, that is, when the program
3579 operates in traditional mode, there may be zero, one or two parameters
3580 besides the options. If there are no parameters, the program reads the
3581 standard input and outputs the permuted index to the standard output.
3582 If there is only one parameter, it names the text @var{input} to be read
3583 instead of the standard input. If two parameters are given, they give
3584 respectively the name of the @var{input} file to read and the name of
3585 the @var{output} file to produce. @emph{Be very careful} to note that,
3586 in this case, the contents of the file given by the second parameter are
3587 destroyed. This behavior is dictated by System V @command{ptx}
3588 compatibility; @sc{gnu} Standards normally discourage output parameters not
3589 introduced by an option.
3591 Note that for @emph{any} file named as the value of an option or as an
3592 input text file, a single dash @kbd{-} may be used, in which case
3593 standard input is assumed. However, it would not make sense to use this
3594 convention more than once per program invocation.
3597 * General options in ptx:: Options which affect general program behavior.
3598 * Charset selection in ptx:: Underlying character set considerations.
3599 * Input processing in ptx:: Input fields, contexts, and keyword selection.
3600 * Output formatting in ptx:: Types of output format, and sizing the fields.
3601 * Compatibility in ptx::
3605 @node General options in ptx
3606 @subsection General options
3612 Print a short note about the copyright and copying conditions, then
3613 exit without further processing.
3616 @itemx --traditional
3617 As already explained, this option disables all @sc{gnu} extensions to
3618 @command{ptx} and switches to traditional mode.
3621 Print a short help on standard output, then exit without further
3625 Print the program version on standard output, then exit without further
3631 @node Charset selection in ptx
3632 @subsection Charset selection
3634 @c FIXME: People don't necessarily know what an IBM-PC was these days.
3635 As it is set up now, the program assumes that the input file is coded
3636 using 8-bit ISO 8859-1 code, also known as Latin-1 character set,
3637 @emph{unless} it is compiled for MS-DOS, in which case it uses the
3638 character set of the IBM-PC. (@sc{gnu} @command{ptx} is not known to work on
3639 smaller MS-DOS machines anymore.) Compared to 7-bit @acronym{ASCII}, the set
3640 of characters which are letters is different; this alters the behavior
3641 of regular expression matching. Thus, the default regular expression
3642 for a keyword allows foreign or diacriticized letters. Keyword sorting,
3643 however, is still crude; it obeys the underlying character set ordering
3649 @itemx --ignore-case
3650 Fold lower case letters to upper case for sorting.
3655 @node Input processing in ptx
3656 @subsection Word selection and input processing
3661 @item --break-file=@var{file}
3663 This option provides an alternative (to @option{-W}) method of describing
3664 which characters make up words. It introduces the name of a
3665 file which contains a list of characters which can@emph{not} be part of
3666 one word; this file is called the @dfn{Break file}. Any character which
3667 is not part of the Break file is a word constituent. If both options
3668 @option{-b} and @option{-W} are specified, then @option{-W} has precedence and
3669 @option{-b} is ignored.
3671 When @sc{gnu} extensions are enabled, the only way to avoid newline as a
3672 break character is to write all the break characters in the file with no
3673 newline at all, not even at the end of the file. When @sc{gnu} extensions
3674 are disabled, spaces, tabs and newlines are always considered as break
3675 characters even if not included in the Break file.
3678 @itemx --ignore-file=@var{file}
3680 The file associated with this option contains a list of words which will
3681 never be taken as keywords in concordance output. It is called the
3682 @dfn{Ignore file}. The file contains exactly one word in each line; the
3683 end of line separation of words is not subject to the value of the
3686 There is a default Ignore file used by @command{ptx} when this option is
3687 not specified, usually found in @file{/usr/local/lib/eign} if this has
3688 not been changed at installation time. If you want to deactivate the
3689 default Ignore file, specify @code{/dev/null} instead.
3692 @itemx --only-file=@var{file}
3694 The file associated with this option contains a list of words which will
3695 be retained in concordance output; any word not mentioned in this file
3696 is ignored. The file is called the @dfn{Only file}. The file contains
3697 exactly one word in each line; the end of line separation of words is
3698 not subject to the value of the @option{-S} option.
3700 There is no default for the Only file. When both an Only file and an
3701 Ignore file are specified, a word is considered a keyword only
3702 if it is listed in the Only file and not in the Ignore file.
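As a rough sketch, assuming @sc{gnu} extensions are enabled, the
following restricts the index to a single keyword. The Only file name
and the sample sentence are hypothetical, and the exact layout of the
output line depends on the output formatting options.

```shell
tmp=$(mktemp -d)

# Hypothetical Only file: one word per line.
printf 'fox\n' > "$tmp/only.txt"

# Index standard input, keeping only keywords listed in the Only file.
index=$(printf 'the quick brown fox jumps\n' | ptx -o "$tmp/only.txt")
printf '%s\n' "$index"
rm -rf "$tmp"
```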
3707 On each input line, the leading sequence of non-white space characters will be
3708 taken to be a reference that has the purpose of identifying this input
3709 line in the resulting permuted index. For more information about reference
3710 production, see @ref{Output formatting in ptx}.
3711 Using this option changes the default value for option @option{-S}.
3713 Using this option, the program does not try very hard to remove
3714 references from contexts in output, but it succeeds in doing so
3715 @emph{when} the context ends exactly at the newline. If option
3716 @option{-r} is used with @option{-S} default value, or when @sc{gnu} extensions
3717 are disabled, this condition is always met and references are completely
3718 excluded from the output contexts.
3720 @item -S @var{regexp}
3721 @itemx --sentence-regexp=@var{regexp}
3723 This option selects which regular expression will describe the end of a
3724 line or the end of a sentence. In fact, this regular expression is not
3725 the only distinction between end of lines or end of sentences, and input
3726 line boundaries have no special significance outside this option. By
3727 default, when @sc{gnu} extensions are enabled and if @option{-r} option is not
3728 used, ends of sentences are used. In this case, the default @var{regexp} is
3729 imported from @sc{gnu} Emacs:
3732 [.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*
3735 Whenever @sc{gnu} extensions are disabled or if @option{-r} option is used, end
3736 of lines are used; in this case, the default @var{regexp} is just:
3742 Using an empty @var{regexp} is equivalent to completely disabling end of
3743 line or end of sentence recognition. In this case, the whole file is
3744 considered to be a single big line or sentence. The user might want to
3745 disallow all truncation flag generation as well, through option @option{-F
3746 ""}. @xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs Manual}.
3749 When the keywords happen to be near the beginning of the input line or
3750 sentence, this often creates an unused area at the beginning of the
3751 output context line; when the keywords happen to be near the end of the
3752 input line or sentence, this often creates an unused area at the end of
3753 the output context line. The program tries to fill those unused areas
3754 by wrapping around context in them; the tail of the input line or
3755 sentence is used to fill the unused area on the left of the output line;
3756 the head of the input line or sentence is used to fill the unused area
3757 on the right of the output line.
3759 As a matter of convenience to the user, many usual backslashed escape
3760 sequences from the C language are recognized and converted to the
3761 corresponding characters by @command{ptx} itself.
3763 @item -W @var{regexp}
3764 @itemx --word-regexp=@var{regexp}
3766 This option selects which regular expression will describe each keyword.
3767 By default, if @sc{gnu} extensions are enabled, a word is a sequence of
3768 letters; the @var{regexp} used is @samp{\w+}. When @sc{gnu} extensions are
3769 disabled, a word is by default anything which ends with a space, a tab
3770 or a newline; the @var{regexp} used is @samp{[^ \t\n]+}.
3772 An empty @var{regexp} is equivalent to not using this option.
3773 @xref{Regexps, , Syntax of Regular Expressions, emacs, The GNU Emacs Manual}.
3776 As a matter of convenience to the user, many usual backslashed escape
3777 sequences, as found in the C language, are recognized and converted to
3778 the corresponding characters by @command{ptx} itself.
3783 @node Output formatting in ptx
3784 @subsection Output formatting
3786 Output format is mainly controlled by the @option{-O} and @option{-T} options
3787 described in the table below. When neither @option{-O} nor @option{-T} is
3788 selected, and if @sc{gnu} extensions are enabled, the program chooses an
3789 output format suitable for a dumb terminal. Each keyword occurrence is
3790 output to the center of one line, surrounded by its left and right
3791 contexts. Each field is properly justified, so the concordance output
3792 can be readily observed. As a special feature, if automatic
3793 references are selected by option @option{-A} and are output before the
3794 left context, that is, if option @option{-R} is @emph{not} selected, then
3795 a colon is added after the reference; this nicely interfaces with @sc{gnu}
3796 Emacs @code{next-error} processing. In this default output format, each
3797 white space character, like newline and tab, is merely changed to
3798 exactly one space, with no special attempt to compress consecutive
3799 spaces. This might change in the future. Except for those white space
3800 characters, every other character of the underlying set of 256
3801 characters is transmitted verbatim.
3803 Output format is further controlled by the following options.
3807 @item -g @var{number}
3808 @itemx --gap-size=@var{number}
3810 Select the size of the minimum white space gap between the fields on the output line.
3813 @item -w @var{number}
3814 @itemx --width=@var{number}
3816 Select the maximum output width of each final line. If references are
3817 used, they are included or excluded from the maximum output width
3818 depending on the value of option @option{-R}. If this option is not
3819 selected, that is, when references are output before the left context,
3820 the maximum output width takes into account the maximum length of all
3821 references. If this option is selected, that is, when references are
3822 output after the right context, the maximum output width does not take
3823 into account the space taken by references, nor the gap that precedes them.
3827 @itemx --auto-reference
3829 Select automatic references. Each input line will have an automatic
3830 reference made up of the file name and the line ordinal, with a single
3831 colon between them. However, the file name will be empty when standard
3832 input is being read. If both @option{-A} and @option{-r} are selected, then
3833 the input reference is still read and skipped, but the automatic
3834 reference is used at output time, overriding the input reference.
3837 @itemx --right-side-refs
3839 In the default output format, when option @option{-R} is not used, any
3840 references produced by the effect of options @option{-r} or @option{-A} are
3841 placed at the beginning of each output line, before the left context. With
3842 default output format, when the @option{-R} option is specified, references
3843 are instead placed to the far right of output lines, after the right
3844 context. For any other output format, option @option{-R} is
3845 ignored, with one exception: with @option{-R} the width of references
3846 is @emph{not} taken into account in total output width given by @option{-w}.
3848 This option is automatically selected whenever @sc{gnu} extensions are disabled.
3851 @item -F @var{string}
3852 @itemx --flag-truncation=@var{string}
3854 This option will request that any truncation in the output be reported
3855 using the string @var{string}. Most output fields theoretically extend
3856 towards the beginning or the end of the current line, or current
3857 sentence, as selected with option @option{-S}. But there is a maximum
3858 allowed output line width, changeable through option @option{-w}, which is
3859 further divided into space for various output fields. When a field has
3860 to be cut short to fit its allotted space, and so cannot reach the beginning
3861 or the end of the current line or sentence, a truncation occurs. By default,
3862 the string used is a single slash, as in @option{-F /}.
3864 @var{string} may have more than one character, as in @option{-F ...}.
3865 Also, in the particular case when @var{string} is empty (@option{-F ""}),
3866 truncation flagging is disabled, and no truncation marks are appended in this case.
3869 As a matter of convenience to the user, many usual backslashed escape
3870 sequences, as found in the C language, are recognized and converted to
3871 the corresponding characters by @command{ptx} itself.
3873 @item -M @var{string}
3874 @itemx --macro-name=@var{string}
3876 Select another @var{string} to be used instead of @samp{xx}, while
3877 generating output suitable for @code{nroff}, @code{troff} or @TeX{}.
3880 @itemx --format=roff
3882 Choose an output format suitable for @code{nroff} or @code{troff}
3883 processing. Each output line will look like:
3886 .xx "@var{tail}" "@var{before}" "@var{keyword_and_after}" "@var{head}" "@var{ref}"
3889 so it will be possible to write a @samp{.xx} roff macro to take care of
3890 the output typesetting. This is the default output format when @sc{gnu}
3891 extensions are disabled. Option @option{-M} can be used to change
3892 @samp{xx} to another macro name.
3894 In this output format, each non-graphical character, like newline and
3895 tab, is merely changed to exactly one space, with no special attempt to
3896 compress consecutive spaces. Each double-quote character (@kbd{"}) is doubled
3897 so it will be correctly processed by @code{nroff} or @code{troff}.
3902 Choose an output format suitable for @TeX{} processing. Each output
3903 line will look like:
3906 \xx @{@var{tail}@}@{@var{before}@}@{@var{keyword}@}@{@var{after}@}@{@var{head}@}@{@var{ref}@}
3910 so it will be possible to write a @code{\xx} definition to take care of
3911 the output typesetting. Note that when references are not being
3912 produced, that is, neither option @option{-A} nor option @option{-r} is
3913 selected, the last parameter of each @code{\xx} call is inhibited.
3914 Option @option{-M} can be used to change @samp{xx} to another macro name.
3917 In this output format, some special characters, like @kbd{$}, @kbd{%},
3918 @kbd{&}, @kbd{#} and @kbd{_} are automatically protected with a
3919 backslash. Curly brackets @kbd{@{}, @kbd{@}} are protected with a
3920 backslash and a pair of dollar signs (to force mathematical mode). The
3921 backslash itself produces the sequence @code{\backslash@{@}}.
3922 Circumflex and tilde diacritical marks produce the sequences @code{^\@{ @}} and
3923 @code{~\@{ @}} respectively. Other diacriticized characters of the
3924 underlying character set produce an appropriate @TeX{} sequence as far
3925 as possible. The other non-graphical characters, like newline and tab,
3926 and all other characters which are not part of @acronym{ASCII}, are merely
3927 changed to exactly one space, with no special attempt to compress
3928 consecutive spaces. Let me know how to improve this special character
3929 processing for @TeX{}.
3934 @node Compatibility in ptx
3935 @subsection The @sc{gnu} extensions to @command{ptx}
3937 This version of @command{ptx} contains a few features which do not exist in
3938 System V @command{ptx}. These extra features are suppressed by using the
3939 @option{-G} command line option, unless overridden by other command line
3940 options. Some @sc{gnu} extensions cannot be recovered by overriding, so the
3941 simple rule is to avoid @option{-G} if you care about @sc{gnu} extensions.
3942 Here are the differences between this program and System V @command{ptx}.
3947 This program can read many input files at once; it always writes the
3948 resulting concordance on standard output. On the other hand, System V
3949 @command{ptx} reads only one file and sends the result to standard output
3950 or, if a second @var{file} parameter is given on the command line, to that @var{file}.
3953 Having output parameters not introduced by options is a dangerous
3954 practice which @sc{gnu} avoids as far as possible. So, for using @command{ptx}
3955 portably between @sc{gnu} and System V, you should always use it with a
3956 single input file, and always expect the result on standard output. You
3957 might also want to automatically configure in a @option{-G} option to
3958 @command{ptx} calls in products using @command{ptx}, if the configurator finds
3959 that the installed @command{ptx} accepts @option{-G}.
3962 The only options available in System V @command{ptx} are options @option{-b},
3963 @option{-f}, @option{-g}, @option{-i}, @option{-o}, @option{-r}, @option{-t} and
3964 @option{-w}. All other options are @sc{gnu} extensions and are not repeated in
3965 this enumeration. Moreover, some options have a slightly different
3966 meaning when @sc{gnu} extensions are enabled, as explained below.
3969 By default, concordance output is not formatted for @code{troff} or
3970 @code{nroff}. It is rather formatted for a dumb terminal. @code{troff}
3971 or @code{nroff} output may still be selected through option @option{-O}.
3974 Unless @option{-R} option is used, the maximum reference width is
3975 subtracted from the total output line width. With @sc{gnu} extensions
3976 disabled, width of references is not taken into account in the output
3977 line width computations.
3980 All 256 characters, even @kbd{NUL}s, are always read and processed from
3981 input file with no adverse effect, even if @sc{gnu} extensions are disabled.
3982 However, System V @command{ptx} does not accept 8-bit characters, a few
3983 control characters are rejected, and the tilde @kbd{~} is also rejected.
3986 Input line length is only limited by available memory, even if @sc{gnu}
3987 extensions are disabled. However, System V @command{ptx} processes only
3988 the first 200 characters in each line.
3991 The break (non-word) characters default to every character except the
3992 letters of the underlying character set, diacriticized or not. When @sc{gnu}
3993 extensions are disabled, the break characters default to space, tab and newline.
3997 The program makes better use of output line width. If @sc{gnu} extensions
3998 are disabled, the program rather tries to imitate System V @command{ptx},
3999 but still, there are some slight disposition glitches this program does
4000 not completely reproduce.
4003 The user can specify both an Ignore file and an Only file. This is not
4004 allowed with System V @command{ptx}.
4009 @node Operating on fields within a line
4010 @chapter Operating on fields within a line
4013 * cut invocation:: Print selected parts of lines.
4014 * paste invocation:: Merge lines of files.
4015 * join invocation:: Join lines on a common field.
4019 @node cut invocation
4020 @section @command{cut}: Print selected parts of lines
4023 @command{cut} writes to standard output selected parts of each line of each
4024 input file, or standard input if no files are given or for a file name of @samp{-}. Synopsis:
4028 cut [@var{option}]@dots{} [@var{file}]@dots{}
4031 In the table which follows, the @var{byte-list}, @var{character-list},
4032 and @var{field-list} are one or more numbers or ranges (two numbers
4033 separated by a dash) separated by commas. Bytes, characters, and
4034 fields are numbered starting at 1. Incomplete ranges may be
4035 given: @samp{-@var{m}} means @samp{1-@var{m}}; @samp{@var{n}-} means
4036 @samp{@var{n}} through end of line or last field. The list elements
4037 can be repeated, can overlap, and can be specified in any order; but
4038 the selected input is written in the same order that it is read, and
4039 is written exactly once.
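The list rules above can be checked with a short sketch. The sample
strings are arbitrary and chosen only for illustration:

@example
# Fields are written in input order, even when listed as 3,1.
printf 'a:b:c:d\n' | cut -d : -f 3,1      # prints a:c

# Overlapping ranges select each character only once.
printf 'abcdef\n' | cut -c 2-4,3-5        # prints bcde

# Incomplete ranges: -2 means 1-2, and 5- runs to the end of the line.
printf 'abcdef\n' | cut -c -2,5-          # prints abef
@end example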
4041 The program accepts the following options. Also see @ref{Common options}.
4046 @item -b @var{byte-list}
4047 @itemx --bytes=@var{byte-list}
4050 Print only the bytes in positions listed in @var{byte-list}. Tabs and
4051 backspaces are treated like any other character; they take up 1 byte.
4052 If an output delimiter is specified (see the description of
4053 @option{--output-delimiter}), then output that string between
4054 ranges of selected bytes.
4056 @item -c @var{character-list}
4057 @itemx --characters=@var{character-list}
4059 @opindex --characters
4060 Print only characters in positions listed in @var{character-list}.
4061 The same as @option{-b} for now, but internationalization will change
4062 that. Tabs and backspaces are treated like any other character; they
4063 take up 1 character.
4064 If an output delimiter is specified (see the description of
4065 @option{--output-delimiter}), then output that string between
4066 ranges of selected characters.
4068 @item -f @var{field-list}
4069 @itemx --fields=@var{field-list}
4072 Print only the fields listed in @var{field-list}. Fields are
4073 separated by a TAB character by default.
4074 Also print any line that contains no delimiter character, unless
4075 the @option{--only-delimited} (@option{-s}) option is specified.
4077 @item -d @var{input_delim_byte}
4078 @itemx --delimiter=@var{input_delim_byte}
4080 @opindex --delimiter
4081 For @option{-f}, fields are separated in the input by the first character
4082 in @var{input_delim_byte} (default is TAB).
4086 Do not split multi-byte characters (no-op for now).
4089 @itemx --only-delimited
4091 @opindex --only-delimited
4092 For @option{-f}, do not print lines that do not contain the field separator
4093 character. Normally, any line without a field separator is printed verbatim.
4095 @itemx --output-delimiter=@var{output_delim_string}
4096 @opindex --output-delimiter
4097 With @option{-f}, output fields are separated by @var{output_delim_string}.
4098 The default with @option{-f} is to use the input delimiter.
4099 When using @option{-b} or @option{-c} to select ranges of byte or
4100 character offsets (as opposed to ranges of fields),
4101 output @var{output_delim_string} between ranges of selected bytes.
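A brief sketch of the delimiter-related options, using arbitrary sample
input, may make the distinction between field and range output clearer:

@example
# Replace the TAB input delimiter with a comma on output.
printf 'a\tb\tc\n' | cut -f 1,3 --output-delimiter=,      # prints a,c

# With -c, the string goes between the selected ranges.
printf 'abcdef\n' | cut -c 1-2,5-6 --output-delimiter=:   # prints ab:ef

# -s drops the first line, which has no TAB delimiter.
printf 'plain\nk\tv\n' | cut -s -f 2                      # prints v
@end example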
4107 @node paste invocation
4108 @section @command{paste}: Merge lines of files
4111 @cindex merging files
4113 @command{paste} writes to standard output lines consisting of sequentially
4114 corresponding lines of each given file, separated by a TAB character.
4115 Standard input is used for a file name of @samp{-} or if no input files are given.
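The following sketch assumes two small files, @file{num2} and
@file{let3}, whose contents are invented here for illustration. It shows
the default merge and the effect of @option{-s} and @option{-d}; in the
comments, @samp{<TAB>} stands for a TAB character:

@example
# Hypothetical sample files, created in a scratch directory.
dir=$(mktemp -d) && cd "$dir"
printf '1\n2\n' > num2
printf 'a\nb\nc\n' > let3

paste num2 let3               # 1<TAB>a, 2<TAB>b, <TAB>c
paste -s num2 let3            # 1<TAB>2, then a<TAB>b<TAB>c
paste -d '%_' num2 let3 num2  # 1%a_1, 2%b_2, %c_
@end example

Note that the shorter file simply contributes empty fields once it is
exhausted.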
4137 paste [@var{option}]@dots{} [@var{file}]@dots{}
4140 The program accepts the following options. Also see @ref{Common options}.
4148 Paste the lines of one file at a time rather than one line from each
4149 file. Using the above example data:
4152 $ paste -s num2 let3
4157 @item -d @var{delim-list}
4158 @itemx --delimiters=@var{delim-list}
4160 @opindex --delimiters
4161 Consecutively use the characters in @var{delim-list} instead of
4162 TAB to separate merged lines. When @var{delim-list} is
4163 exhausted, start again at its beginning. Using the above example data:
4166 $ paste -d '%_' num2 let3 num2
4175 @node join invocation
4176 @section @command{join}: Join lines on a common field
4179 @cindex common field, joining on
4181 @command{join} writes to standard output a line for each pair of input
4182 lines that have identical join fields. Synopsis:
4185 join [@var{option}]@dots{} @var{file1} @var{file2}
4189 Either @var{file1} or @var{file2} (but not both) can be @samp{-},
4190 meaning standard input. @var{file1} and @var{file2} should be
4191 sorted on the join fields.
4193 Normally, the sort order is that of the
4194 collating sequence specified by the @env{LC_COLLATE} locale. Unless
4195 the @option{-t} option is given, the sort comparison ignores blanks at
4196 the start of the join field, as in @code{sort -b}. If the
4197 @option{--ignore-case} option is given, the sort comparison ignores
4198 the case of characters in the join field, as in @code{sort -f}.
4200 However, as a @sc{gnu} extension, if the input has no unpairable lines, the
4201 sort order can be any order that considers two fields to be equal if and
4202 only if the sort comparison described above considers them to be equal.
4220 The defaults are: the join field is the first field in each line;
4221 fields in the input are separated by one or more blanks, with leading
4222 blanks on the line ignored; fields in the output are separated by a
4223 space; each output line consists of the join field, the remaining
4224 fields from @var{file1}, then the remaining fields from @var{file2}.
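As a minimal sketch of these defaults, assume two files named
@file{left} and @file{right} (hypothetical names and contents), each
already sorted on its first field:

@example
dir=$(mktemp -d) && cd "$dir"
printf '1 apple\n2 banana\n3 cherry\n' > left
printf '1 red\n3 dark\n4 blue\n' > right

# Only keys 1 and 3 appear in both files.
join left right
# prints:
#   1 apple red
#   3 cherry dark
@end example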
4226 The program accepts the following options. Also see @ref{Common options}.
4230 @item -a @var{file-number}
4232 Print a line for each unpairable line in file @var{file-number} (either
4233 @samp{1} or @samp{2}), in addition to the normal output.
4235 @item -e @var{string}
4237 Replace those output fields that are missing in the input with @var{string}.
4241 @itemx --ignore-case
4243 @opindex --ignore-case
4244 Ignore differences in case when comparing keys.
4245 With this option, the lines of the input files must be ordered in the same way.
4246 Use @samp{sort -f} to produce this ordering.
4248 @item -1 @var{field}
4249 @itemx -j1 @var{field}
4252 Join on field @var{field} (a positive integer) of file 1.
4254 @item -2 @var{field}
4255 @itemx -j2 @var{field}
4258 Join on field @var{field} (a positive integer) of file 2.
4260 @item -j @var{field}
4261 Equivalent to @option{-1 @var{field} -2 @var{field}}.
4263 @item -o @var{field-list}@dots{}
4264 Construct each output line according to the format in @var{field-list}.
4265 Each element in @var{field-list} is either the single character @samp{0} or
4266 has the form @var{m.n} where the file number, @var{m}, is @samp{1} or
4267 @samp{2} and @var{n} is a positive field number.
4269 A field specification of @samp{0} denotes the join field.
4270 In most cases, the functionality of the @samp{0} field spec
4271 may be reproduced using the explicit @var{m.n} that corresponds
4272 to the join field. However, when printing unpairable lines
4273 (using either of the @option{-a} or @option{-v} options), there is no way
4274 to specify the join field using @var{m.n} in @var{field-list}
4275 if there are unpairable lines in both files.
4276 To give @command{join} that functionality, @acronym{POSIX} invented the @samp{0}
4277 field specification notation.
4279 The elements in @var{field-list}
4280 are separated by commas or blanks. Multiple @var{field-list}
4281 arguments can be given after a single @option{-o} option; the values
4282 of all lists given with @option{-o} are concatenated together.
4283 All output lines, including those printed because of any @option{-a} or @option{-v}
4284 option, are subject to the specified @var{field-list}.
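The sketch below illustrates why the @samp{0} field specification is
useful when both files have unpairable lines. The file names, their
contents, and the filler string @samp{NONE} are all chosen arbitrarily:

@example
dir=$(mktemp -d) && cd "$dir"
printf '1 apple\n2 banana\n' > left
printf '1 red\n4 blue\n' > right

# 0 is the join field; NONE fills fields missing from either file.
join -a 1 -a 2 -e NONE -o 0,1.2,2.2 left right
# prints:
#   1 apple red
#   2 banana NONE
#   4 NONE blue
@end example

Writing @samp{1.1} instead of @samp{0} would print @samp{NONE} as the key
of the last line, since that line has no counterpart in the first file.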
4287 Use character @var{char} as the input and output field separator.
4289 @item -v @var{file-number}
4290 Print a line for each unpairable line in file @var{file-number}
4291 (either @samp{1} or @samp{2}), instead of the normal output.
4295 In addition, when @sc{gnu} @command{join} is invoked with exactly one argument,
4296 the @option{--help} and @option{--version} options are recognized.
4297 @xref{Common options}.
4300 @node Operating on characters
4301 @chapter Operating on characters
4303 @cindex operating on characters
4305 These commands operate on individual characters.
4308 * tr invocation:: Translate, squeeze, and/or delete characters.
4309 * expand invocation:: Convert tabs to spaces.
4310 * unexpand invocation:: Convert spaces to tabs.
4315 @section @command{tr}: Translate, squeeze, and/or delete characters
4322 tr [@var{option}]@dots{} @var{set1} [@var{set2}]
4325 @command{tr} copies standard input to standard output, performing
4326 one of the following operations:
4330 translate, and optionally squeeze repeated characters in the result,
4332 squeeze repeated characters,
4336 delete characters, then squeeze repeated characters from the result.
4339 The @var{set1} and (if given) @var{set2} arguments define ordered
4340 sets of characters, referred to below as @var{set1} and @var{set2}. These
4341 sets are the characters of the input that @command{tr} operates on.
4342 The @option{--complement} (@option{-c}) option replaces @var{set1} with its
4343 complement (all of the characters that are not in @var{set1}).
4346 * Character sets:: Specifying sets of characters.
4347 * Translating:: Changing one set of characters to another.
4348 * Squeezing:: Squeezing repeats and deleting.
4349 * Warnings in tr:: Warning messages.
4353 @node Character sets
4354 @subsection Specifying sets of characters
4356 @cindex specifying sets of characters
4358 The format of the @var{set1} and @var{set2} arguments resembles
4359 the format of regular expressions; however, they are not regular
4360 expressions, only lists of characters. Most characters simply
4361 represent themselves in these strings, but the strings can contain
4362 the shorthands listed below, for convenience. Some of them can be
4363 used only in @var{set1} or @var{set2}, as noted below.
4367 @item Backslash escapes
4368 @cindex backslash escapes
4370 A backslash followed by a character not listed below causes an error message.
4389 The character with the value given by @var{ooo}, which is 1 to 3 octal digits.
4398 The notation @samp{@var{m}-@var{n}} expands to all of the characters
4399 from @var{m} through @var{n}, in ascending order. @var{m} should
4400 collate before @var{n}; if it doesn't, an error results. As an example,
4401 @samp{0-9} is the same as @samp{0123456789}.
4403 @sc{gnu} @command{tr} does not support the System V syntax that uses square
4404 brackets to enclose ranges. Translations specified in that format
4405 sometimes work as expected, since the brackets are often transliterated
4406 to themselves. However, they should be avoided because they sometimes
4407 behave unexpectedly. For example, @samp{tr -d '[0-9]'} deletes brackets as well as digits.
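A quick check of the bracket pitfall, using an arbitrary input string:

@example
printf '[x1]\n' | tr -d '[0-9]'   # prints x (brackets deleted too)
printf '[x1]\n' | tr -d 0-9       # prints [x]
@end example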
4410 Many historically common and even accepted uses of ranges are not
4411 portable. For example, on @acronym{EBCDIC} hosts using the @samp{A-Z}
4412 range will not do what most would expect because @samp{A} through @samp{Z}
4413 are not contiguous as they are in @acronym{ASCII}.
4414 If you can rely on a @acronym{POSIX} compliant version of @command{tr}, then
4415 the best way to work around this is to use character classes (see below).
4416 Otherwise, it is most portable (and most ugly) to enumerate the members of the ranges.
4419 @item Repeated characters
4420 @cindex repeated characters
4422 The notation @samp{[@var{c}*@var{n}]} in @var{set2} expands to @var{n}
4423 copies of character @var{c}. Thus, @samp{[y*6]} is the same as
4424 @samp{yyyyyy}. The notation @samp{[@var{c}*]} in @var{set2} expands
4425 to as many copies of @var{c} as are needed to make @var{set2} as long as
4426 @var{set1}. If @var{n} begins with @samp{0}, it is interpreted in
4427 octal, otherwise in decimal.
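The two repeat forms can be sketched as follows; the input strings are
arbitrary examples:

@example
# [#*] pads set2 with as many # as set1 has digits.
printf 'tel 555-0199\n' | tr 0-9 '[#*]'    # prints tel ###-####

# [x*2]z is the same as xxz.
printf 'aabbcc\n' | tr abc '[x*2]z'        # prints xxxxzz
@end example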
4429 @item Character classes
4430 @cindex character classes
4432 The notation @samp{[:@var{class}:]} expands to all of the characters in
4433 the (predefined) class @var{class}. The characters expand in no
4434 particular order, except for the @code{upper} and @code{lower} classes,
4435 which expand in ascending order. When the @option{--delete} (@option{-d})
4436 and @option{--squeeze-repeats} (@option{-s}) options are both given, any
4437 character class can be used in @var{set2}. Otherwise, only the
4438 character classes @code{lower} and @code{upper} are accepted in
4439 @var{set2}, and then only if the corresponding character class
4440 (@code{upper} and @code{lower}, respectively) is specified in the same
4441 relative position in @var{set1}. Doing this specifies case conversion.
4442 The class names are given below; an error results when an invalid class name is given.
4454 Horizontal whitespace.
4463 Printable characters, not including space.
4469 Printable characters, including space.
4472 Punctuation characters.
4475 Horizontal or vertical whitespace.
4484 @item Equivalence classes
4485 @cindex equivalence classes
4487 The syntax @samp{[=@var{c}=]} expands to all of the characters that are
4488 equivalent to @var{c}, in no particular order. Equivalence classes are
4489 a relatively recent invention intended to support non-English alphabets.
4490 But there seems to be no standard way to define them or determine their
4491 contents. Therefore, they are not fully implemented in @sc{gnu} @command{tr};
4492 each character's equivalence class consists only of that character,
4493 which is of no particular use.
4499 @subsection Translating
4501 @cindex translating characters
4503 @command{tr} performs translation when @var{set1} and @var{set2} are
4504 both given and the @option{--delete} (@option{-d}) option is not given.
4505 @command{tr} translates each character of its input that is in @var{set1}
4506 to the corresponding character in @var{set2}. Characters not in
4507 @var{set1} are passed through unchanged. When a character appears more
4508 than once in @var{set1} and the corresponding characters in @var{set2}
4509 are not all the same, only the final one is used. For example, these
4510 two commands are equivalent:
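As a quick illustration of this rule (the input word is arbitrary), both
commands below should print the same line, because only the last mapping
for @samp{a} takes effect:

@example
printf 'banana\n' | tr aaa xyz   # prints bznznz
printf 'banana\n' | tr a z       # prints bznznz
@end example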
4517 A common use of @command{tr} is to convert lowercase characters to
4518 uppercase. This can be done in many ways. Here are three of them:
4521 tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
4522 tr a-z A-Z
4523 tr '[:lower:]' '[:upper:]'
4527 But note that using ranges like @code{a-z} above is not portable.
4529 When @command{tr} is performing translation, @var{set1} and @var{set2}
4530 typically have the same length. If @var{set1} is shorter than
4531 @var{set2}, the extra characters at the end of @var{set2} are ignored.
4533 On the other hand, making @var{set1} longer than @var{set2} is not
4534 portable; @acronym{POSIX} says that the result is undefined. In this situation,
4535 BSD @command{tr} pads @var{set2} to the length of @var{set1} by repeating
4536 the last character of @var{set2} as many times as necessary. System V
4537 @command{tr} truncates @var{set1} to the length of @var{set2}.
4539 By default, @sc{gnu} @command{tr} handles this case like BSD @code{tr}. When
4540 the @option{--truncate-set1} (@option{-t}) option is given, @sc{gnu} @command{tr}
4541 handles this case like the System V @command{tr} instead. This option is
4542 ignored for operations other than translation.
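A small sketch of the two behaviors with @sc{gnu} @command{tr}, using an
arbitrary four-letter input:

@example
# Default (BSD-like): set2 is padded with its last character, y.
printf 'abcd\n' | tr abcd xy      # prints xyyy

# With -t, set1 is truncated to ab, so c and d pass through.
printf 'abcd\n' | tr -t abcd xy   # prints xycd
@end example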
4544 Acting like System V @command{tr} in this case breaks the relatively common BSD idiom:
4548 tr -cs A-Za-z0-9 '\012'
4552 because it converts only zero bytes (the first element in the
4553 complement of @var{set1}), rather than all non-alphanumerics, to newlines.
4557 By the way, the above idiom is not portable because it uses ranges.
4558 Assuming a @acronym{POSIX} compliant @command{tr}, here is a better way to write it:
4561 tr -cs '[:alnum:]' '[\n*]'
4566 @subsection Squeezing repeats and deleting
4568 @cindex squeezing repeat characters
4569 @cindex deleting characters
4571 When given just the @option{--delete} (@option{-d}) option, @command{tr}
4572 removes any input characters that are in @var{set1}.
4574 When given just the @option{--squeeze-repeats} (@option{-s}) option,
4575 @command{tr} replaces each input sequence of a repeated character that
4576 is in @var{set1} with a single occurrence of that character.
4578 When given both @option{--delete} and @option{--squeeze-repeats}, @command{tr}
4579 first performs any deletions using @var{set1}, then squeezes repeats
4580 from any remaining characters using @var{set2}.
4582 The @option{--squeeze-repeats} option may also be used when translating,
4583 in which case @command{tr} first performs translation, then squeezes
4584 repeats from any remaining characters using @var{set2}.
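The three squeezing modes described above can be sketched with arbitrary
sample input:

@example
# Squeeze only: runs of a and b collapse, c is untouched.
printf 'aaabbbccc\n' | tr -s ab        # prints abccc

# Delete a (set1), then squeeze runs of c (set2).
printf 'aabbccdd\n' | tr -ds a c       # prints bbcdd

# Translate spaces to newlines, then squeeze the newlines.
printf 'one   two\n' | tr -s ' ' '\n'  # prints one and two on separate lines
@end example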
4586 Here are some examples to illustrate various combinations of options:
4591 Remove all zero bytes:
4598 Put all words on lines by themselves. This converts all
4599 non-alphanumeric characters to newlines, then squeezes each string
4600 of repeated newlines into a single newline:
4603 tr -cs '[:alnum:]' '[\n*]'
4607 Convert each sequence of repeated newlines to a single newline:
4614 Find doubled occurrences of words in a document.
4615 For example, people often write ``the the'' with the duplicated words
4616 separated by a newline. The Bourne shell script below works first
4617 by converting each sequence of punctuation and blank characters to a
4618 single newline. That puts each ``word'' on a line by itself.
4619 Next it maps all uppercase characters to lower case, and finally it
4620 runs @command{uniq} with the @option{-d} option to print out only the words
4621 that were adjacent duplicates.
4626 | tr -s '[:punct:][:blank:]' '\n' \
4627 | tr '[:upper:]' '[:lower:]' \
4632 Deleting a small set of characters is usually straightforward. For example,
4633 to remove all @samp{a}s, @samp{x}s, and @samp{M}s you would do this:
4639 However, when @samp{-} is one of those characters, it can be tricky because
4640 @samp{-} has special meanings. Performing the same task as above but also
4641 removing all @samp{-} characters, we might try @code{tr -d -axM}, but
4642 that would fail because @command{tr} would try to interpret @option{-a} as
4643 a command-line option. Alternatively, we could try putting the hyphen
4644 inside the string, @code{tr -d a-xM}, but that wouldn't work either because
4645 it would make @command{tr} interpret @code{a-x} as the range of characters
4646 @samp{a}@dots{}@samp{x} rather than the three characters @samp{a}, @samp{-}, and @samp{x}.
4647 One way to solve the problem is to put the hyphen at the end of the list
4654 More generally, use the equivalence class notation @code{[=c=]}
4655 with @samp{-} (or any other character) in place of the @samp{c}:
4661 Note how single quotes are used in the above example to protect the
4662 square brackets from interpretation by a shell.
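The two approaches above can be sketched as follows, assuming @sc{gnu} @command{tr}. The function names are hypothetical. Both delete @samp{a}, @samp{x}, @samp{M}, and @samp{-} from their argument.

```shell
# Hyphen placed last in the set, so it cannot form a range.
delete_with_trailing_hyphen() (
  printf '%s\n' "$1" | tr -d 'axM-'
)

# Hyphen written as an equivalence class; the quotes protect the
# square brackets from the shell.
delete_with_equiv_class() (
  printf '%s\n' "$1" | tr -d '[=-=]axM'
)

delete_with_trailing_hyphen 'Max-a-million'   # prints "million"
delete_with_equiv_class 'Max-a-million'       # prints "million"
```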
4667 @node Warnings in tr
4668 @subsection Warning messages
4670 @vindex POSIXLY_CORRECT
4671 Setting the environment variable @env{POSIXLY_CORRECT} turns off the
4672 following warning and error messages, for strict compliance with
4673 @acronym{POSIX}. Otherwise, the following diagnostics are issued:
4678 When the @option{--delete} option is given but @option{--squeeze-repeats}
4679 is not, and @var{set2} is given, @sc{gnu} @command{tr} by default prints
4680 a usage message and exits, because @var{set2} would not be used.
4681 The @acronym{POSIX} specification says that @var{set2} must be ignored in
4682 this case. Silently ignoring arguments is a bad idea.
4685 When an ambiguous octal escape is given. For example, @samp{\400}
4686 is actually @samp{\40} followed by the digit @samp{0}, because the
4687 value 400 octal does not fit into a single byte.
4691 @sc{gnu} @command{tr} does not provide complete BSD or System V compatibility.
4692 For example, it is impossible to disable interpretation of the @acronym{POSIX}
4693 constructs @samp{[:alpha:]}, @samp{[=c=]}, and @samp{[c*10]}. Also, @sc{gnu}
4694 @command{tr} does not delete zero bytes automatically, unlike traditional
4695 Unix versions, which provide no way to preserve zero bytes.
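The ambiguous octal escape described above can be observed with a short sketch (the function name is hypothetical). Because 400 octal is 256, which does not fit in one byte, @samp{\400} is read as the two characters @samp{\40} (a space) and @samp{0}. The warning goes to standard error, which is discarded here.

```shell
# set2 is parsed as the two characters \040 (space) and "0",
# so "a" maps to a space and "b" maps to "0".
ambiguous_octal_demo() (
  printf 'ab\n' | tr 'ab' '\400' 2>/dev/null
)

ambiguous_octal_demo    # prints a space followed by "0"
```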
4698 @node expand invocation
4699 @section @command{expand}: Convert tabs to spaces
4702 @cindex tabs to spaces, converting
4703 @cindex converting tabs to spaces
4705 @command{expand} writes the contents of each given @var{file}, or standard
4706 input if none are given or for a @var{file} of @samp{-}, to standard
4707 output, with tab characters converted to the appropriate number of
4711 expand [@var{option}]@dots{} [@var{file}]@dots{}
4714 By default, @command{expand} converts all tabs to spaces. It preserves
4715 backspace characters in the output; they decrement the column count for
4716 tab calculations. The default action is equivalent to @option{-t 8} (set
4717 tabs every 8 columns).
4719 The program accepts the following options. Also see @ref{Common options}.
4723 @item -t @var{tab1}[,@var{tab2}]@dots{}
4724 @itemx --tabs=@var{tab1}[,@var{tab2}]@dots{}
4727 @cindex tabstops, setting
4728 If only one tab stop is given, set the tabs @var{tab1} spaces apart
4729 (default is 8). Otherwise, set the tabs at columns @var{tab1},
4730 @var{tab2}, @dots{} (numbered from 0), and replace any tabs beyond the
4731 last tabstop given with single spaces. Tabstops can be separated by
4732 blanks as well as by commas.
4734 On older systems, @command{expand} supports an obsolete option
4735 @option{-@var{tab1}[,@var{tab2}]@dots{}}, where tabstops must be
4736 separated by commas. @acronym{POSIX} 1003.1-2001 (@pxref{Standards
4737 conformance}) does not allow this; use @option{-t
4738 @var{tab1}[,@var{tab2}]@dots{}} instead.
4744 @cindex initial tabs, converting
4745 Only convert initial tabs (those that precede every character that is
4746 neither a space nor a tab) on each line to spaces.
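A minimal sketch of the tab-stop rules described above, using hypothetical function names. The first function uses evenly spaced stops, the second an explicit list of columns, and the third converts only initial tabs.

```shell
# Tab stops every 4 columns: the tab after "a" expands to 3 spaces.
expand_every_4() (
  printf 'a\tb\n' | expand -t 4
)

# Explicit stops at columns 2 and 6 (numbered from 0).
expand_list() (
  printf 'a\tb\tc\n' | expand -t 2,6
)

# Only the leading tab is expanded; the tab after "a" is kept.
expand_initial() (
  printf '\ta\tb\n' | expand -i -t 4
)

expand_every_4    # prints "a   b"
expand_list       # prints "a b   c"
expand_initial    # prints four spaces, "a", a tab, "b"
```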
4751 @node unexpand invocation
4752 @section @command{unexpand}: Convert spaces to tabs
4756 @command{unexpand} writes the contents of each given @var{file}, or
4757 standard input if none are given or for a @var{file} of @samp{-}, to
4758 standard output, with strings of two or more space or tab characters
4759 converted to as many tabs as possible followed by as many spaces as are
4763 unexpand [@var{option}]@dots{} [@var{file}]@dots{}
4766 By default, @command{unexpand} converts only initial spaces and tabs (those
4767 that precede every character that is neither a space nor a tab) on each line. It
4768 preserves backspace characters in the output; they decrement the column
4769 count for tab calculations. By default, tabs are set at every 8th
4772 The program accepts the following options. Also see @ref{Common options}.
4776 @item -t @var{tab1}[,@var{tab2}]@dots{}
4777 @itemx --tabs=@var{tab1}[,@var{tab2}]@dots{}
4780 If only one tab stop is given, set the tabs @var{tab1} spaces apart
4781 instead of the default 8. Otherwise, set the tabs at columns
4782 @var{tab1}, @var{tab2}, @dots{} (numbered from 0), and leave spaces and
4783 tabs beyond the tabstops given unchanged. Tabstops can be separated by
4784 blanks as well as by commas. This option implies the @option{-a} option.
4786 On older systems, @command{unexpand} supports an obsolete option
4787 @option{-@var{tab1}[,@var{tab2}]@dots{}}, where tabstops must be
4788 separated by commas. (Unlike @option{-t}, this obsolete option does
4789 not imply @option{-a}.) @acronym{POSIX} 1003.1-2001 (@pxref{Standards
4790 conformance}) does not allow this; use @option{--first-only -t
4791 @var{tab1}[,@var{tab2}]@dots{}} instead.
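The following sketch, assuming @sc{gnu} @command{unexpand} and using hypothetical function names, contrasts the default behavior with @option{-a}, and shows that @option{-t} implies @option{-a} unless @option{--first-only} is also given.

```shell
# Default: only leading blanks are converted (8 spaces become one tab).
unexpand_leading() (
  printf '        x\n' | unexpand
)

# Default: blanks after the first non-blank character are left alone.
unexpand_default_inner() (
  printf 'a       b\n' | unexpand
)

# -a: the 7 spaces that end at column 8 become one tab.
unexpand_all() (
  printf 'a       b\n' | unexpand -a
)

# -t 4 implies -a: the 3 spaces that end at column 4 become one tab.
unexpand_t4() (
  printf 'a   b\n' | unexpand -t 4
)

# --first-only -t 4: stops every 4 columns, but inner blanks are kept.
unexpand_t4_first_only() (
  printf 'a   b\n' | unexpand --first-only -t 4
)

unexpand_all | od -c | head -n 1    # show the tab explicitly
```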
4797 Convert all strings of two or more spaces or tabs, not just initial
4803 @node Directory listing
4804 @chapter Directory listing
4806 This chapter describes the @command{ls} command and its variants @command{dir}
4807 and @command{vdir}, which list information about files.
4810 * ls invocation:: List directory contents.
4811 * dir invocation:: Briefly ls.
4812 * vdir invocation:: Verbosely ls.
4813 * dircolors invocation:: Color setup for ls, etc.
4818 @section @command{ls}: List directory contents
4821 @cindex directory listing
4823 The @command{ls} program lists information about files (of any type,
4824 including directories). Options and file arguments can be intermixed
4825 arbitrarily, as usual.
4827 For non-option command-line arguments that are directories, by default
4828 @command{ls} lists the contents of directories, not recursively, and
4829 omitting files with names beginning with @samp{.}. For other non-option
4830 arguments, by default @command{ls} lists just the file name. If no
4831 non-option argument is specified, @command{ls} operates on the current
4832 directory, acting as if it had been invoked with a single argument of @samp{.}.
4835 By default, the output is sorted alphabetically, according to the locale
4836 settings in effect. @footnote{If you use a non-@acronym{POSIX}
4837 locale (e.g., by setting @env{LC_ALL} to @samp{en_US}), then @command{ls} may
4838 produce output that is sorted differently than you're accustomed to.
4839 In that case, set the @env{LC_ALL} environment variable to @samp{C}.}
4840 If standard output is
4841 a terminal, the output is in columns (sorted vertically) and control
4842 characters are output as question marks; otherwise, the output is listed
4843 one per line and control characters are output as-is.
4845 Because @command{ls} is such a fundamental program, it has accumulated many
4846 options over the years. They are described in the subsections below;
4847 within each subsection, options are listed alphabetically (ignoring case).
4848 The division of options into the subsections is not absolute, since some
4849 options affect more than one aspect of @command{ls}'s operation.
4851 Also see @ref{Common options}.
4854 * Which files are listed::
4855 * What information is listed::
4856 * Sorting the output::
4857 * More details about version sort::
4858 * General output formatting::
4859 * Formatting file timestamps::
4860 * Formatting the file names::
4864 @node Which files are listed
4865 @subsection Which files are listed
4867 These options determine which files @command{ls} lists information for.
4868 By default, any files and the contents of any directories on the command
4877 List all files in directories, including files that start with @samp{.}.
4882 @opindex --almost-all
4883 List all files in directories except for @file{.} and @file{..}.
4886 @itemx --ignore-backups
4888 @opindex --ignore-backups
4889 @cindex backup files, ignoring
4890 Do not list files that end with @samp{~}, unless they are given on the
4896 @opindex --directory
4897 List just the names of directories, as with other types of files, rather
4898 than listing their contents.
4899 @c The following sentence is the same as the one for -F.
4900 Do not follow symbolic links listed on the
4901 command line unless the @option{--dereference-command-line} (@option{-H}),
4902 @option{--dereference} (@option{-L}), or
4903 @option{--dereference-command-line-symlink-to-dir} options are specified.
4906 @itemx --dereference-command-line
4908 @opindex --dereference-command-line
4909 @cindex symbolic links, dereferencing
4910 If a command line argument specifies a symbolic link, show information
4911 for the file the link references rather than for the link itself.
4913 @itemx --dereference-command-line-symlink-to-dir
4914 @opindex --dereference-command-line-symlink-to-dir
4915 @cindex symbolic links, dereferencing
4916 Do not dereference symbolic links, with one exception:
4917 if a command line argument specifies a symbolic link that refers to
4918 a directory, show information for that directory rather than for the
4920 This is the default behavior when no other dereferencing-related
4921 option has been specified (@option{--classify} (@option{-F}),
4922 @option{--directory} (@option{-d}),
4924 @option{--dereference} (@option{-L}), or
4925 @option{--dereference-command-line} (@option{-H})).
4928 @itemx --ignore=@var{pattern}
4930 @opindex --ignore=@var{pattern}
4931 Do not list files whose names match the shell pattern (not regular
4932 expression) @var{pattern} unless they are given on the command line. As
4933 in the shell, an initial @samp{.} in a file name does not match a
4934 wildcard at the start of @var{pattern}. Sometimes it is useful
4935 to give this option several times. For example,
4938 $ ls --ignore='.??*' --ignore='.[^.]' --ignore='#*'
4941 The first option ignores names of length 3 or more that start with @samp{.},
4942 the second ignores all two-character names that start with @samp{.}
4943 except @samp{..}, and the third ignores names that start with @samp{#}.
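A simpler illustration of @option{--ignore}, as a sketch with a hypothetical function name and a scratch directory: object files are hidden from a directory listing, but a file named explicitly on the command line is still listed.

```shell
ignore_demo() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch a.c a.o b.o
  ls --ignore='*.o'        # lists only a.c
  ls --ignore='*.o' a.o    # a.o is given explicitly, so it is listed
  cd / && rm -rf "$dir"
)

ignore_demo
```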
4946 @itemx --dereference
4948 @opindex --dereference
4949 @cindex symbolic links, dereferencing
4950 When showing file information for a symbolic link, show information
4951 for the file the link references rather than the link itself.
4956 @opindex --recursive
4957 @cindex recursive directory listing
4958 @cindex directory listing, recursive
4959 List the contents of all directories recursively.
4964 @node What information is listed
4965 @subsection What information is listed
4967 These options affect the information that @command{ls} displays. By
4968 default, only file names are shown.
4974 @cindex hurd, author, printing
4975 List each file's author when producing long format directory listings.
4976 In GNU/Hurd, file authors can differ from their owners, but in other
4977 operating systems the two are the same.
4983 @cindex dired Emacs mode support
4984 With the long listing (@option{-l}) format, print an additional line after
4988 //DIRED// @var{beg1} @var{end1} @var{beg2} @var{end2} @dots{}
4992 The @var{begN} and @var{endN} are unsigned integers that record the
4993 byte position of the beginning and end of each file name in the output.
4994 This makes it easy for Emacs to find the names, even when they contain
4995 unusual characters such as space or newline, without fancy searching.
4997 If directories are being listed recursively (@option{-R}), output a similar
4998 line with offsets for each subdirectory name:
5000 //SUBDIRED// @var{beg1} @var{end1} @dots{}
5003 Finally, output a line of the form:
5005 //DIRED-OPTIONS// --quoting-style=@var{word}
5007 where @var{word} is the quoting style (@pxref{Formatting the file names}).
5009 Here is an actual example:
5012 $ mkdir -p a/sub/deeper a/sub2
5014 $ touch a/sub/deeper/file
5015 $ ls -gloRF --dired a
5018 -rw-r--r-- 1 0 Nov 9 18:30 f1
5019 -rw-r--r-- 1 0 Nov 9 18:30 f2
5020 drwxr-xr-x 3 4096 Nov 9 18:30 sub/
5021 drwxr-xr-x 2 4096 Nov 9 18:30 sub2/
5025 drwxr-xr-x 2 4096 Nov 9 18:30 deeper/
5029 -rw-r--r-- 1 0 Nov 9 18:30 file
5033 //DIRED// 55 57 98 100 141 144 186 190 252 258 327 331
5034 //SUBDIRED// 2 3 195 200 263 275 335 341
5035 //DIRED-OPTIONS// --quoting-style=literal
5038 Note that the pairs of offsets on the @samp{//DIRED//} line above delimit
5039 these names: @file{f1}, @file{f2}, @file{sub}, @file{sub2}, @file{deeper},
5041 The offsets on the @samp{//SUBDIRED//} line delimit the following
5042 directory names: @file{a}, @file{a/sub}, @file{a/sub/deeper}, @file{a/sub2}.
5044 Here is an example of how to extract the fifth entry name, @samp{deeper},
5045 corresponding to the pair of offsets, 252 and 258:
5048 $ ls -gloRF --dired a > out
5049 $ dd bs=1 skip=252 count=6 < out 2>/dev/null; echo
5053 Note that although the listing above includes a trailing slash
5054 for the @samp{deeper} entry, the offsets select the name without
5055 the trailing slash. However, if you invoke @command{ls} with @option{--dired}
5056 along with an option like @option{--escape} (aka @option{-b}) and operate
5057 on a file whose name contains special characters, notice that the backslash
5062 $ ls -blog --dired 'a b'
5063 -rw-r--r-- 1 0 Nov 9 18:41 a\ b
5065 //DIRED-OPTIONS// --quoting-style=escape
5068 If you use a quoting style that adds quote marks
5069 (e.g., @option{--quoting-style=c}), then the offsets include the quote marks.
5070 So beware that the user may select the quoting style via the environment
5071 variable @env{QUOTING_STYLE}. Hence, applications using @option{--dired}
5072 should either specify an explicit @option{--quoting-style=literal} option
5073 (aka @option{-N} or @option{--literal}) on the command line, or else be
5074 prepared to parse the escaped names.
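As a rough sketch of the advice above, a script can pass an explicit @option{--quoting-style=literal} and then use the first pair of offsets on the @samp{//DIRED//} line to extract a name. The function name @code{first_dired_name} is hypothetical, and the sketch assumes the directory is not empty.

```shell
# Print the first file name that "ls --dired" reports for directory $1.
first_dired_name() (
  out=$(mktemp) || exit 1
  LC_ALL=C ls -l --dired --quoting-style=literal "$1" > "$out"
  # The //DIRED// line holds pairs of byte offsets; keep the first pair.
  offsets=$(sed -n 's|^//DIRED// ||p' "$out")
  set -- $offsets
  beg=$1
  end=$2
  dd bs=1 skip="$beg" count=$((end - beg)) < "$out" 2>/dev/null
  echo
  rm -f "$out"
)

scratch=$(mktemp -d)
touch "$scratch/a b"
first_dired_name "$scratch"    # prints "a b"
rm -rf "$scratch"
```

Forcing the literal style means the offsets delimit the raw name, whatever @env{QUOTING_STYLE} is set to in the caller's environment.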
5077 @opindex --full-time
5078 Produce long format directory listings, and list times in full. It is
5079 equivalent to using @option{--format=long} with
5080 @option{--time-style=full-iso} (@pxref{Formatting file timestamps}).
5084 Produce long format directory listings, but don't display owner information.
5090 Inhibit display of group information in a long format directory listing.
5091 (This is the default in some non-@sc{gnu} versions of @command{ls}, so we
5092 provide this option for compatibility.)
5095 @itemx --human-readable
5097 @opindex --human-readable
5098 @cindex human-readable output
5099 Append a size letter to each size, such as @samp{M} for mebibytes.
5100 Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes.
5101 This option is equivalent to @option{--block-size=human} (@pxref{Block size}).
5102 Use the @option{--si} option if you prefer powers of 1000.
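To make the powers-of-1024 rule concrete, here is a small sketch (hypothetical function name) that creates a file of exactly 1,048,576 bytes and prints its size field with and without @option{-h}. It uses @option{-n} so that the size is always the fifth field.

```shell
size_demo() (
  dir=$(mktemp -d) || exit 1
  # 1024 blocks of 1024 bytes = 1048576 bytes.
  dd if=/dev/zero of="$dir/f" bs=1024 count=1024 2>/dev/null
  LC_ALL=C ls -ln  "$dir/f" | awk '{print $5}'   # plain byte count
  LC_ALL=C ls -lnh "$dir/f" | awk '{print $5}'   # human-readable
  rm -rf "$dir"
)

size_demo    # prints "1048576" and then "1.0M"
```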
5108 @cindex inode number, printing
5109 Print the inode number (also called the file serial number and index
5110 number) of each file to the left of the file name. (This number
5111 uniquely identifies each file within a particular filesystem.)
5114 @itemx --format=long
5115 @itemx --format=verbose
5118 @opindex long ls @r{format}
5119 @opindex verbose ls @r{format}
5120 In addition to the name of each file, print the file type, permissions,
5121 number of hard links, owner name, group name, size, and
5122 timestamp (@pxref{Formatting file timestamps}), normally
5123 the modification time.
5125 Normally the size is printed as a byte count without punctuation, but
5126 this can be overridden (@pxref{Block size}). For example, @option{-h}
5127 prints an abbreviated, human-readable count, and
5128 @samp{--block-size="'1"} prints a byte count with the thousands
5129 separator of the current locale.
5131 For each directory that is listed, preface the files with a line
5132 @samp{total @var{blocks}}, where @var{blocks} is the total disk allocation
5133 for all files in that directory. The block size currently defaults to 1024
5134 bytes, but this can be overridden (@pxref{Block size}).
5135 The @var{blocks} computed counts each hard link separately;
5136 this is arguably a deficiency.
5138 @cindex permissions, output by @command{ls}
5139 The permissions listed are similar to symbolic mode specifications
5140 (@pxref{Symbolic Modes}). But @command{ls} combines multiple bits into the
5141 third character of each set of permissions as follows:
5144 If the setuid or setgid bit and the corresponding executable bit
5148 If the setuid or setgid bit is set but the corresponding executable bit
5152 If the sticky bit and the other-executable bit are both set.
5155 If the sticky bit is set but the other-executable bit is not set.
5158 If the executable bit is set and none of the above apply.
5164 Following the permission bits is a single character that specifies
5165 whether an alternate access method applies to the file. When that
5166 character is a space, there is no alternate access method. When it
5167 is a printing character (e.g., @samp{+}), then there is such a method.
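The combined permission characters described above can be seen with a short sketch. The function names are hypothetical, and the sketch assumes a file system that lets the owner set the setuid and sticky bits. Only the first ten characters of each line are kept, so any alternate access method marker is ignored.

```shell
# Print the file type and permission characters for $1.
mode_string() (
  LC_ALL=C ls -ld "$1" | cut -c1-10
)

perm_demo() (
  dir=$(mktemp -d) || exit 1
  mkdir "$dir/t" "$dir/T"
  chmod 1777 "$dir/t"    # sticky and other-executable: "t"
  chmod 1776 "$dir/T"    # sticky without other-executable: "T"
  touch "$dir/s" "$dir/S"
  chmod 4755 "$dir/s"    # setuid and owner-executable: "s"
  chmod 4644 "$dir/S"    # setuid without owner-executable: "S"
  for f in t T s S; do
    mode_string "$dir/$f"
  done
  rm -rf "$dir"
)

perm_demo
```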
5170 @itemx --numeric-uid-gid
5172 @opindex --numeric-uid-gid
5173 @cindex numeric uid and gid
5174 Produce long format directory listings, but
5175 display numeric UIDs and GIDs instead of the owner and group names.
5179 Produce long format directory listings, but don't display group information.
5180 It is equivalent to using @option{--format=long} with @option{--no-group}.
5186 @cindex disk allocation
5187 @cindex size of files, reporting
5188 Print the disk allocation of each file to the left of the file name.
5189 This is the amount of disk space used by the file, which is usually a
5190 bit more than the file's size, but it can be less if the file has holes.
5192 Normally the disk allocation is printed in units of
5193 1024 bytes, but this can be overridden (@pxref{Block size}).
5195 @cindex NFS mounts from BSD to HP-UX
5196 For files that are NFS-mounted from an HP-UX system to a BSD system,
5197 this option reports sizes that are half the correct values. On HP-UX
5198 systems, it reports sizes that are twice the correct values for files
5199 that are NFS-mounted from BSD systems. This is due to a flaw in HP-UX;
5200 it also affects the HP-UX @command{ls} program.
5205 Append an SI-style abbreviation to each size, such as @samp{MB} for
5206 megabytes. Powers of 1000 are used, not 1024; @samp{MB} stands for
5207 1,000,000 bytes. This option is equivalent to
5208 @option{--block-size=si}. Use the @option{-h} or
5209 @option{--human-readable} option if
5210 you prefer powers of 1024.
5215 @node Sorting the output
5216 @subsection Sorting the output
5218 @cindex sorting @command{ls} output
5219 These options change the order in which @command{ls} sorts the information
5220 it outputs. By default, sorting is done by character code
5221 (e.g., @acronym{ASCII} order).
5227 @itemx --time=status
5231 @opindex ctime@r{, printing or sorting by}
5232 @opindex status time@r{, printing or sorting by}
5233 @opindex use time@r{, printing or sorting files by}
5234 If the long listing format (e.g., @option{-l}, @option{-o}) is being used,
5235 print the status change time (the @samp{ctime} in the inode) instead of
5236 the modification time.
5237 When explicitly sorting by time (@option{--sort=time} or @option{-t})
5238 or when not using a long listing format,
5239 sort according to the status change time.
5243 @cindex unsorted directory listing
5244 @cindex directory order, listing by
5245 Primarily, like @option{-U}---do not sort; list the files in whatever
5246 order they are stored in the directory. But also enable @option{-a} (list
5247 all files) and disable @option{-l}, @option{--color}, and @option{-s} (if they
5248 were specified before the @option{-f}).
5254 @cindex reverse sorting
5255 Reverse whatever the sorting method is---e.g., list files in reverse
5256 alphabetical order, oldest first, smallest first, or whatever.
5262 @opindex size of files@r{, sorting files by}
5263 Sort by file size, largest first.
5269 @opindex modification time@r{, sorting files by}
5270 Sort by modification time (the @samp{mtime} in the inode), newest first.
5274 @itemx --time=access
5277 @opindex use time@r{, printing or sorting files by}
5278 @opindex atime@r{, printing or sorting files by}
5279 @opindex access time@r{, printing or sorting files by}
5280 If the long listing format (e.g., @option{--format=long}) is being used,
5281 print the last access time (the @samp{atime} in the inode).
5282 When explicitly sorting by time (@option{--sort=time} or @option{-t})
5283 or when not using a long listing format, sort according to the access time.
5289 @opindex none@r{, sorting option for @command{ls}}
5290 Do not sort; list the files in whatever order they are
5291 stored in the directory. (Do not do any of the other unrelated things
5292 that @option{-f} does.) This is especially useful when listing very large
5293 directories, since not doing any sorting can be noticeably faster.
5296 @itemx --sort=version
5299 @opindex version@r{, sorting option for @command{ls}}
5300 Sort by version name and number, lowest first. It behaves like a default
5301 sort, except that each sequence of decimal digits is treated numerically
5302 as an index/version number. (@xref{More details about version sort}.)
5305 @itemx --sort=extension
5308 @opindex extension@r{, sorting files by}
5309 Sort directory contents alphabetically by file extension (characters
5310 after the last @samp{.}); files with no extension are sorted first.
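The sorting options above can be compared side by side with a sketch like the following. The function and file names are hypothetical, and @command{touch -d} is assumed to be the @sc{gnu} version. Each output line shows an option followed by the resulting order.

```shell
sort_demo() (
  LC_ALL=C
  export LC_ALL
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  printf 'xxxxxxxxxx' > big.txt    # 10 bytes, oldest
  printf 'x' > small.c             # 1 byte
  : > noext                        # 0 bytes, newest
  touch -d '2001-01-01' big.txt
  touch -d '2002-01-01' small.c
  touch -d '2003-01-01' noext
  for opts in -S -rS -t -rt -X; do
    printf '%s: %s\n' "$opts" "$(ls $opts | paste -sd ' ' -)"
  done
  cd / && rm -rf "$dir"
)

sort_demo
```

With these files, @option{-S} lists @file{big.txt} first, @option{-t} lists @file{noext} first, and @option{-r} reverses either order.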
5315 @node More details about version sort
5316 @subsection More details about version sort
5318 The version sort takes into account the fact that file names frequently include
5319 indices or version numbers. Standard sorting functions usually do not produce
5320 the ordering that people expect because comparisons are made on a
5321 character-by-character basis. The version
5322 sort addresses this problem, and is especially useful when browsing
5323 directories that contain many files with indices/version numbers in their
5328 foo.zml-1.gz foo.zml-1.gz
5329 foo.zml-100.gz foo.zml-2.gz
5330 foo.zml-12.gz foo.zml-6.gz
5331 foo.zml-13.gz foo.zml-12.gz
5332 foo.zml-2.gz foo.zml-13.gz
5333 foo.zml-25.gz foo.zml-25.gz
5334 foo.zml-6.gz foo.zml-100.gz
5337 Note also that numeric parts with leading zeroes are considered as
5342 abc-1.007.tgz abc-1.007.tgz
5343 abc-1.012b.tgz abc-1.01a.tgz
5344 abc-1.01a.tgz abc-1.012b.tgz
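The first comparison above can be reproduced with a sketch such as this one (hypothetical function name, scratch directory). It covers only the plain numeric case. The treatment of leading zeros is not exercised here, since it may differ between versions.

```shell
version_sort_demo() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  for n in 1 100 12 13 2 25 6; do
    touch "foo.zml-$n.gz"
  done
  LC_ALL=C ls -1  | paste -sd ' ' -    # character-by-character order
  LC_ALL=C ls -1v | paste -sd ' ' -    # version order
  cd / && rm -rf "$dir"
)

version_sort_demo
```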
5347 @node General output formatting
5348 @subsection General output formatting
5350 These options affect the appearance of the overall output.
5355 @itemx --format=single-column
5358 @opindex single-column @r{output of files}
5359 List one file per line. This is the default for @command{ls} when standard
5360 output is not a terminal.
5363 @itemx --format=vertical
5366 @opindex vertical @r{sorted files in columns}
5367 List files in columns, sorted vertically. This is the default for
5368 @command{ls} if standard output is a terminal. It is always the default
5369 for the @command{dir} program.
5370 @sc{gnu} @command{ls} uses variable width columns to display as many files as
5371 possible in the fewest lines.
5373 @item --color [=@var{when}]
5375 @cindex color, distinguishing file types with
5376 Specify whether to use color for distinguishing file types. @var{when}
5377 may be omitted, or one of:
5380 @vindex none @r{color option}
5381 - Do not use color at all. This is the default.
5383 @vindex auto @r{color option}
5384 @cindex terminal, using color iff
5385 - Only use color if standard output is a terminal.
5387 @vindex always @r{color option}
5390 Specifying @option{--color} and no @var{when} is equivalent to
5391 @option{--color=always}.
5392 Piping a colorized listing through a pager like @code{more} or
5393 @code{less} usually produces unreadable results. However, using
5394 @code{more -f} does seem to work.
5398 @itemx --indicator-style=classify
5401 @opindex --indicator-style
5402 @cindex file type and executables, marking
5403 @cindex executables and file type, marking
5404 Append a character to each file name indicating the file type. Also,
5405 for regular files that are executable, append @samp{*}. The file type
5406 indicators are @samp{/} for directories, @samp{@@} for symbolic links,
5407 @samp{|} for FIFOs, @samp{=} for sockets, and nothing for regular files.
5408 @c The following sentence is the same as the one for -d.
5409 Do not follow symbolic links listed on the
5410 command line unless the @option{--dereference-command-line} (@option{-H}),
5411 @option{--dereference} (@option{-L}), or
5412 @option{--dereference-command-line-symlink-to-dir} options are specified.
5414 @item --indicator-style=@var{word}
5415 @opindex --indicator-style
5416 Append a character indicator with style @var{word} to entry names,
5420 Do not append any character indicator; this is the default.
5422 Append @samp{/} for directories, @samp{@@} for symbolic links, @samp{|}
5423 for FIFOs, @samp{=} for sockets, and nothing for regular files. This is
5424 the same as the @option{-p} or @option{--file-type} option.
5426 Append @samp{*} for executable regular files, otherwise behave as for
5427 @samp{file-type}. This is the same as the @option{-F} or
5428 @option{--classify} option.
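A short sketch of the indicator styles described above, with a hypothetical function name. It creates a directory, a regular file, an executable file, and a symbolic link, then lists them in each style.

```shell
indicator_demo() (
  LC_ALL=C
  export LC_ALL
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  mkdir d
  touch f x
  chmod +x x
  ln -s f l
  ls -F                           | paste -sd ' ' -
  ls --indicator-style=file-type  | paste -sd ' ' -
  ls --indicator-style=none       | paste -sd ' ' -
  cd / && rm -rf "$dir"
)

indicator_demo
```

Only the @samp{classify} style marks the executable @file{x} with @samp{*}.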
5433 Print file sizes in 1024-byte blocks, overriding the default block
5434 size (@pxref{Block size}).
5435 This option is equivalent to @option{--block-size=1K}.
5438 @itemx --format=commas
5441 @opindex commas@r{, outputting between files}
5442 List files horizontally, with as many as will fit on each line,
5443 separated by @samp{, } (a comma and a space).
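For instance, a minimal sketch with three short file names (the function name is hypothetical) shows the comma-separated format next to the single-column format:

```shell
commas_demo() (
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch a b c
  LC_ALL=C ls -m    # one line: a, b, c
  LC_ALL=C ls -1    # one name per line
  cd / && rm -rf "$dir"
)

commas_demo
```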
5447 @itemx --indicator-style=file-type
5448 @opindex --file-type
5449 @opindex --indicator-style
5450 @cindex file type, marking
5451 Append a character to each file name indicating the file type. This is
5452 like @option{-F}, except that executables are not marked.
5454 @item -x
5455 @itemx --format=across
5456 @itemx --format=horizontal
5459 @opindex across@r{, listing files}
5460 @opindex horizontal@r{, listing files}
5461 List the files in columns, sorted horizontally.
5464 @itemx --tabsize=@var{cols}
5467 Assume that each tabstop is @var{cols} columns wide. The default is 8.
5468 @command{ls} uses tabs where possible in the output, for efficiency. If
5469 @var{cols} is zero, do not use tabs at all.
5472 @itemx --width=@var{cols}
5476 Assume the screen is @var{cols} columns wide. The default is taken
5477 from the terminal settings if possible; otherwise the environment
5478 variable @env{COLUMNS} is used if it is set; otherwise the default
5484 @node Formatting file timestamps
5485 @subsection Formatting file timestamps
5487 By default, file timestamps are listed in abbreviated form. Most
5488 locales use a timestamp like @samp{2002-03-30 23:45}. However, the
5489 default @acronym{POSIX} locale uses a date like @samp{Mar 30@ @ 2002}
5490 for non-recent timestamps, and a date-without-year and time like
5491 @samp{Mar 30 23:45} for recent timestamps.
5493 A timestamp is considered to be @dfn{recent} if it is less than six
5494 months old, and is not dated in the future. If a timestamp dated
5495 today is not listed in recent form, the timestamp is in the future,
5496 which means you probably have clock skew problems which may break
5497 programs like @command{make} that rely on file timestamps.
5499 The following option changes how file timestamps are printed.
5502 @item --time-style=@var{style}
5503 @opindex --time-style
5505 List timestamps in style @var{style}. The @var{style} should
5506 be one of the following:
5511 List timestamps using @var{format}, where @var{format} is interpreted
5512 like the format argument of @command{date} (@pxref{date invocation}).
5513 For example, @option{--time-style="+%Y-%m-%d %H:%M:%S"} causes
5514 @command{ls} to list timestamps like @samp{2002-03-30 23:45:56}. As
5515 with @command{date}, @var{format}'s interpretation is affected by the
5516 @env{LC_TIME} locale category.
5518 If @var{format} contains two format strings separated by a newline,
5519 the former is used for non-recent files and the latter for recent
5520 files; if you want output columns to line up, you may need to insert
5521 spaces in one of the two formats.
5524 List timestamps in full using @acronym{ISO} 8601 date, time, and time zone
5525 format with nanosecond precision, e.g., @samp{2002-03-30
5526 23:45:56.477817180 -0700}. This style is equivalent to
5527 @samp{+%Y-%m-%d %H:%M:%S.%N %z}.
5529 This is useful because the time output includes all the information that
5530 is available from the operating system. For example, this can help
5531 explain @command{make}'s behavior, since @acronym{GNU} @command{make}
5532 uses the full timestamp to determine whether a file is out of date.
5535 List @acronym{ISO} 8601 date and time in minutes, e.g.,
5536 @samp{2002-03-30 23:45}. These timestamps are shorter than
5537 @samp{full-iso} timestamps, and are usually good enough for everyday
5538 work. This style is equivalent to @samp{+%Y-%m-%d %H:%M}.
5541 List @acronym{ISO} 8601 dates for non-recent timestamps (e.g.,
5542 @samp{2002-03-30@ }), and @acronym{ISO} 8601 month, day, hour, and
5543 minute for recent timestamps (e.g., @samp{03-30 23:45}). These
5544 timestamps are uglier than @samp{long-iso} timestamps, but they carry
5545 nearly the same information in a smaller space and their brevity helps
5546 @command{ls} output fit within traditional 80-column output lines.
5547 The following two @command{ls} invocations are equivalent:
5552 ls -l --time-style="+%Y-%m-%d $newline%m-%d %H:%M"
5553 ls -l --time-style="iso"
5558 List timestamps in a locale-dependent form. For example, a Finnish
5559 locale might list non-recent timestamps like @samp{maalis 30@ @ 2002}
5560 and recent timestamps like @samp{maalis 30 23:45}. Locale-dependent
5561 timestamps typically consume more space than @samp{iso} timestamps and
5562 are harder for programs to parse because locale conventions vary so
5563 widely, but they are easier for many people to read.
5565 The @env{LC_TIME} locale category specifies the timestamp format. The
5566 default @acronym{POSIX} locale uses timestamps like @samp{Mar 30@
5567 @ 2002} and @samp{Mar 30 23:45}; in this locale, the following two
5568 @command{ls} invocations are equivalent:
5573 ls -l --time-style="+%b %e %Y$newline%b %e %H:%M"
5574 ls -l --time-style="locale"
5577 Other locales behave differently. For example, in a German locale,
5578 @option{--time-style="locale"} might be equivalent to
5579 @option{--time-style="+%e. %b %Y $newline%e. %b %H:%M"}
5580 and might generate timestamps like @samp{30. M@"ar 2002@ } and
5581 @samp{30. M@"ar 23:45}.
5583 @item posix-@var{style}
5585 List @acronym{POSIX}-locale timestamps if the @env{LC_TIME} locale
5586 category is @acronym{POSIX}, @var{style} timestamps otherwise. For
5587 example, the default style, which is @samp{posix-long-iso}, lists
5588 timestamps like @samp{Mar 30@ @ 2002} and @samp{Mar 30 23:45} when in
5589 the @acronym{POSIX} locale, and like @samp{2002-03-30 23:45} otherwise.
5594 You can specify the default value of the @option{--time-style} option
5595 with the environment variable @env{TIME_STYLE}; if @env{TIME_STYLE} is not set
5596 the default style is @samp{posix-long-iso}. @acronym{GNU} Emacs 21 and
5597 later can parse @acronym{ISO} dates, but older Emacs versions cannot, so if
5598 you are using an older version of Emacs and specify a non-@acronym{POSIX}
5599 locale, you may need to set @samp{TIME_STYLE="locale"}.
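The time styles can be compared with a sketch like the one below. The function names are hypothetical, @command{touch -d} is assumed to be the @sc{gnu} version, and the time zone is pinned to UTC so that the output is reproducible. The file is dated 2002, so it counts as non-recent and the first of the two newline-separated formats applies to it.

```shell
# Print the two timestamp fields of "ls -ln" for file $1;
# any remaining arguments are passed to ls.
stamp() (
  f=$1
  shift
  LC_ALL=C TZ=UTC0 ls -ln "$@" "$f" | awk '{print $6, $7}'
)

time_style_demo() (
  dir=$(mktemp -d) || exit 1
  f=$dir/f
  TZ=UTC0 touch -d '2002-03-30 23:45:56' "$f"
  stamp "$f" --time-style='+%Y-%m-%d %H:%M:%S'
  stamp "$f" --time-style=long-iso
  # Two formats separated by a newline: non-recent first, recent second.
  two=$(printf '+old %%Y\nnew %%H:%%M')
  stamp "$f" --time-style="$two"
  # The environment variable supplies the default style.
  ( TIME_STYLE=long-iso; export TIME_STYLE; stamp "$f" )
  rm -rf "$dir"
)

time_style_demo
```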
5602 @node Formatting the file names
5603 @subsection Formatting the file names
5605 These options change how file names themselves are printed.
5611 @itemx --quoting-style=escape
5614 @opindex --quoting-style
5615 @cindex backslash sequences for file names
5616 Quote nongraphic characters in file names using alphabetic and octal
5617 backslash sequences like those used in C.
5621 @itemx --quoting-style=literal
5624 @opindex --quoting-style
5625 Do not quote file names.
5628 @itemx --hide-control-chars
5630 @opindex --hide-control-chars
5631 Print question marks instead of nongraphic characters in file names.
5632 This is the default if the output is a terminal and the program is @command{ls}.
5637 @itemx --quoting-style=c
5639 @opindex --quote-name
5640 @opindex --quoting-style
5641 Enclose file names in double quotes and quote nongraphic characters as in C.
5644 @item --quoting-style=@var{word}
5645 @opindex --quoting-style
5646 @cindex quoting style
5647 Use style @var{word} to quote output names. The @var{word} should
5648 be one of the following:
5651 Output names as-is; this is the same as the @option{-N} or
5652 @option{--literal} option.
5654 Quote names for the shell if they contain shell metacharacters or would
5655 cause ambiguous output.
5657 Quote names for the shell, even if they would normally not require quoting.
5659 Quote names as for a C language string; this is the same as the
5660 @option{-Q} or @option{--quote-name} option.
5662 Quote as with @samp{c} except omit the surrounding double-quote
5663 characters; this is the same as the @option{-b} or @option{--escape} option.
5665 Quote as with @samp{c} except use quotation marks appropriate for the locale.
5668 @c Use @t instead of @samp to avoid duplicate quoting in some output styles.
5669 Like @samp{clocale}, but quote @t{`like this'} instead of @t{"like
5670 this"} in the default C locale. This looks nicer on many displays.
5673 You can specify the default value of the @option{--quoting-style} option
5674 with the environment variable @env{QUOTING_STYLE}. If that environment
5675 variable is not set, the default value is @samp{literal}, but this
5676 default may change to @samp{shell} in a future version of this package.
5678 @item --show-control-chars
5679 @opindex --show-control-chars
5680 Print nongraphic characters as-is in file names.
5681 This is the default unless the output is a terminal and the program is @command{ls}.
5687 @node dir invocation
5688 @section @command{dir}: Briefly list directory contents
5691 @cindex directory listing, brief
5693 @command{dir} (also installed as @code{d}) is equivalent to @code{ls -C
5694 -b}; that is, by default files are listed in columns, sorted vertically,
5695 and special characters are represented by backslash escape sequences.
5697 @xref{ls invocation, @command{ls}}.
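To see what the backslash escapes look like, the sketch below lists a
file whose name contains a tab, first with @samp{ls -b} (the quoting
that @command{dir} applies) and then with @samp{ls -Q} for comparison.
The scratch directory and the file name are hypothetical.

```shell
# Sketch only: the directory and the file name are made up for the demo.
tmp=$(mktemp -d)
name=$(printf 'a\tb')          # a file name containing a literal tab
touch "$tmp/$name"

escaped=$(ls -b "$tmp")        # backslash escapes, no surrounding quotes
quoted=$(ls -Q "$tmp")         # C-style string inside double quotes

printf '%s\n%s\n' "$escaped" "$quoted"
rm -rf "$tmp"
```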
5700 @node vdir invocation
5701 @section @command{vdir}: Verbosely list directory contents
5704 @cindex directory listing, verbose
5706 @command{vdir} (also installed as @code{v}) is equivalent to @code{ls -l
5707 -b}; that is, by default files are listed in long format and special
5708 characters are represented by backslash escape sequences.
5710 @node dircolors invocation
5711 @section @command{dircolors}: Color setup for @code{ls}
5715 @cindex setup for color
5717 @command{dircolors} outputs a sequence of shell commands to set up the
5718 terminal for color output from @command{ls} (and @code{dir}, etc.).
5722 eval `dircolors [@var{option}]@dots{} [@var{file}]`
5725 If @var{file} is specified, @command{dircolors} reads it to determine which
5726 colors to use for which file types and extensions. Otherwise, a
5727 precompiled database is used. For details on the format of these files,
5728 run @samp{dircolors --print-database}.
5731 @vindex SHELL @r{environment variable, and color}
5732 The output is a shell command to set the @env{LS_COLORS} environment
5733 variable. You can specify the shell syntax to use on the command line,
5734 or @command{dircolors} will guess it from the value of the @env{SHELL}
5735 environment variable.
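A minimal sketch of typical use from a Bourne-compatible shell follows.
It relies on the compiled-in database; pinning @env{TERM} to
@samp{xterm} is an assumption made only so that the sketch behaves the
same way in a non-interactive shell.

```shell
# Bourne-shell form: evaluate the generated commands in the current shell.
eval "$(TERM=xterm dircolors --bourne-shell)"
printf 'LS_COLORS holds %s bytes\n' "$(printf '%s' "$LS_COLORS" | wc -c)"

# C-shell form: only inspect the text, since this is not a csh session.
csh_form=$(TERM=xterm dircolors --c-shell)
printf '%s\n' "$csh_form" | cut -c1-17
```

In an interactive setup the @env{TERM} override would normally be
omitted, so that @command{dircolors} can consult the real terminal type.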
5737 The program accepts the following options. Also see @ref{Common options}.
5742 @itemx --bourne-shell
5745 @opindex --bourne-shell
5746 @cindex Bourne shell syntax for color setup
5747 @cindex @code{sh} syntax for color setup
5748 Output Bourne shell commands. This is the default if the @env{SHELL}
5749 environment variable is set and does not end with @samp{csh} or @samp{tcsh}.
5758 @cindex C shell syntax for color setup
5759 @cindex @code{csh} syntax for color setup
5760 Output C shell commands. This is the default if @env{SHELL} ends with
5761 @samp{csh} or @samp{tcsh}.
5764 @itemx --print-database
5766 @opindex --print-database
5767 @cindex color database, printing
5768 @cindex database for color setup, printing
5769 @cindex printing color database
5770 Print the (compiled-in) default color configuration database. This
5771 output is itself a valid configuration file, and is fairly descriptive
5772 of the possibilities.
5777 @node Basic operations
5778 @chapter Basic operations
5780 @cindex manipulating files
5782 This chapter describes the commands for basic file manipulation:
5783 copying, moving (renaming), and deleting (removing).
5786 * cp invocation:: Copy files.
5787 * dd invocation:: Convert and copy a file.
5788 * install invocation:: Copy files and set attributes.
5789 * mv invocation:: Move (rename) files.
5790 * rm invocation:: Remove files or directories.
5791 * shred invocation:: Remove files more securely.
5796 @section @command{cp}: Copy files and directories
5799 @cindex copying files and directories
5800 @cindex files, copying
5801 @cindex directories, copying
5803 @command{cp} copies files (or, optionally, directories). The copy is
5804 completely independent of the original. You can either copy one file to
5805 another, or copy arbitrarily many files to a destination directory.
5809 cp [@var{option}]@dots{} @var{source} @var{dest}
5810 cp [@var{option}]@dots{} @var{source}@dots{} @var{directory}
5813 If the last argument names an existing directory, @command{cp} copies each
5814 @var{source} file into that directory (retaining the same name).
5815 Otherwise, if only two files are given, it copies the first onto the
5816 second. It is an error if the last argument is not a directory and more
5817 than two non-option arguments are given.
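The two forms, and the error case, can be sketched as follows. All file
and directory names here are hypothetical and live in a scratch
directory.

```shell
# Sketch only: every name below is invented for the demonstration.
tmp=$(mktemp -d)
echo one > "$tmp/a"
echo two > "$tmp/b"
mkdir "$tmp/dir"

cp "$tmp/a" "$tmp/copy_of_a"             # first form: SOURCE DEST
cp "$tmp/a" "$tmp/b" "$tmp/dir"          # second form: SOURCE... DIRECTORY
copied=$(cat "$tmp/copy_of_a")
in_dir=$(ls "$tmp/dir" | tr '\n' ' ')

# Three names whose last one is not a directory: cp reports an error.
if cp "$tmp/a" "$tmp/b" "$tmp/copy_of_a" 2>/dev/null; then
  three_names=accepted
else
  three_names=rejected
fi
echo "$copied / $in_dir/ $three_names"
rm -rf "$tmp"
```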
5819 Generally, files are written just as they are read. For exceptions,
5820 see the @option{--sparse} option below.
5822 By default, @command{cp} does not copy directories. However, the
5823 @option{-R}, @option{-a}, and @option{-r} options cause @command{cp} to
5824 copy recursively by descending into source directories and copying files
5825 to corresponding destination directories.
5827 By default, @command{cp} follows symbolic links only when not copying
5828 recursively. This default can be overridden with the
5829 @option{--archive} (@option{-a}), @option{-d}, @option{--dereference}
5830 (@option{-L}), @option{--no-dereference} (@option{-P}), and
5831 @option{-H} options. If more than one of these options is specified,
5832 the last one silently overrides the others.
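The difference between following a link and copying the link itself can
be seen in a small sketch. The names @file{target}, @file{link},
@file{followed}, and @file{kept} are hypothetical.

```shell
# Sketch only: the file names are invented for the demonstration.
tmp=$(mktemp -d)
echo data > "$tmp/target"
ln -s target "$tmp/link"

cp "$tmp/link" "$tmp/followed"     # default: copies what the link points to
cp -P "$tmp/link" "$tmp/kept"      # --no-dereference: copies the link itself

if [ -L "$tmp/followed" ]; then followed=symlink; else followed=regular; fi
if [ -L "$tmp/kept" ]; then kept=symlink; else kept=regular; fi
echo "followed: $followed, kept: $kept"
rm -rf "$tmp"
```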
5834 By default, @command{cp} copies the contents of special files only
5835 when not copying recursively. This default can be overridden with the
5836 @option{--copy-contents} option.
5838 @cindex self-backups
5839 @cindex backups, making only
5840 @command{cp} generally refuses to copy a file onto itself, with the
5841 following exception: if @option{--force --backup} is specified with
5842 @var{source} and @var{dest} identical, and referring to a regular file,
5843 @command{cp} will make a backup file, either regular or numbered, as
5844 specified in the usual ways (@pxref{Backup options}). This is useful when
5845 you simply want to make a backup of an existing file before changing it.
5847 The program accepts the following options. Also see @ref{Common options}.
5854 Preserve as much as possible of the structure and attributes of the
5855 original files in the copy (but do not attempt to preserve internal
5856 directory structure; i.e., @samp{ls -U} may list the entries in a copied
5857 directory in a different order).
5858 Equivalent to @option{-dpPR}.
5861 @itemx @w{@kbd{--backup}[=@var{method}]}
5864 @vindex VERSION_CONTROL
5865 @cindex backups, making
5866 @xref{Backup options}.
5867 Make a backup of each file that would otherwise be overwritten or removed.
5868 As a special case, @command{cp} makes a backup of @var{source} when the force
5869 and backup options are given and @var{source} and @var{dest} are the same
5870 name for an existing, regular file. One useful application of this
5871 combination of options is this tiny Bourne shell script:
5874 #!/bin/sh
5875 # Usage: backup FILE...
5876 # Create a @sc{gnu}-style backup of each listed FILE.
5877 for i; do
5878   cp --backup --force "$i" "$i"
5879 done
5882 @item --copy-contents
5883 @cindex directories, copying recursively
5884 @cindex copying directories recursively
5885 @cindex recursively copying directories
5886 @cindex non-directories, copying as special files
5887 If copying recursively, copy the contents of any special files (e.g.,
5888 FIFOs and device files) as if they were regular files. This means
5889 trying to read the data in each source file and writing it to the
5890 destination. It is usually a mistake to use this option, as it
5891 normally has undesirable effects on special files like FIFOs and the
5892 ones typically found in the @file{/dev} directory. In most cases,
5893 @code{cp -R --copy-contents} will hang indefinitely trying to read
5894 from FIFOs and special files like @file{/dev/console}, and it will
5895 fill up your destination disk if you use it to copy @file{/dev/zero}.
5896 This option has no effect unless copying recursively, and it does not
5897 affect the copying of symbolic links.
5901 @cindex symbolic links, copying
5902 @cindex hard links, preserving
5903 Copy symbolic links as symbolic links rather than copying the files that
5904 they point to, and preserve hard links between source files in the copies.
5905 Equivalent to @option{--no-dereference --preserve=links}.
5911 When copying without this option and an existing destination file cannot
5912 be opened for writing, the copy fails. However, with @option{--force},
5913 when a destination file cannot be opened, @command{cp} then unlinks it and
5914 tries to open it again. Contrast this behavior with that enabled by
5915 @option{--link} and @option{--symbolic-link}, whereby the destination file
5916 is never opened but rather is unlinked unconditionally. Also see the
5917 description of @option{--remove-destination}.
5921 If a command line argument specifies a symbolic link, then copy the
5922 file it points to rather than the symbolic link itself. However,
5923 copy (preserving its nature) any symbolic link that is encountered
5924 via recursive traversal.
5927 @itemx --interactive
5929 @opindex --interactive
5930 Prompt whether to overwrite existing regular destination files.
5936 Make hard links instead of copies of non-directories.
5939 @itemx --dereference
5941 @opindex --dereference
5942 Always follow symbolic links.
5945 @itemx --no-dereference
5947 @opindex --no-dereference
5948 @cindex symbolic links, copying
5949 Copy symbolic links as symbolic links rather than copying the files that they point to.
5953 @itemx @w{@kbd{--preserve}[=@var{attribute_list}]}
5956 @cindex file information, preserving
5957 Preserve the specified attributes of the original files.
5958 If specified, the @var{attribute_list} must be a comma-separated list
5959 of one or more of the following strings:
5963 Preserve the permission attributes.
5965 Preserve the owner and group. On most modern systems,
5966 only the super-user may change the owner of a file, and regular users
5967 may preserve the group ownership of a file only if they happen to be
5968 members of the desired group.
5970 Preserve the times of last access and last modification.
5972 Preserve in the destination files
5973 any links between corresponding source files.
5974 @c Give examples illustrating how hard links are preserved.
5975 @c Also, show how soft links map to hard links with -L and -H.
5977 Preserve all file attributes.
5978 Equivalent to specifying all of the above.
5979 @c Mention ACLs here.
5982 Using @option{--preserve} with no @var{attribute_list} is equivalent
5983 to @option{--preserve=mode,ownership,timestamps}.
5985 In the absence of this option, each destination file is created with the
5986 permissions of the corresponding source file, minus the bits set in the
5987 umask and minus the set-user-id and set-group-id bits. @xref{File permissions}.
5989 @itemx @w{@kbd{--no-preserve}=@var{attribute_list}}
5990 @cindex file information, preserving
5991 Do not preserve the specified attributes. The @var{attribute_list}
5992 has the same form as for @option{--preserve}.
5996 @cindex parent directories and @command{cp}
5997 Form the name of each destination file by appending to the target
5998 directory a slash and the specified name of the source file. The last
5999 argument given to @command{cp} must be the name of an existing directory.
6000 For example, the command:
6003 cp --parents a/b/c existing_dir
6007 copies the file @file{a/b/c} to @file{existing_dir/a/b/c}, creating
6008 any missing intermediate directories.
6010 @itemx @w{@kbd{--reply}[=@var{how}]}
6012 @cindex interactivity
6013 Using @option{--reply=yes} makes @command{cp} act as if @samp{yes} were
6014 given as a response to every prompt about a destination file. That effectively
6015 cancels any preceding @option{--interactive} or @option{-i} option.
6016 Specify @option{--reply=no} to make @command{cp} act as if @samp{no} were
6017 given as a response to every prompt about a destination file.
6018 Specify @option{--reply=query} to make @command{cp} prompt the user
6019 about each existing destination file.
6026 @opindex --recursive
6027 @cindex directories, copying recursively
6028 @cindex copying directories recursively
6029 @cindex recursively copying directories
6030 @cindex non-directories, copying as special files
6031 Copy directories recursively. Symbolic links are not followed by
6032 default; see the @option{--archive} (@option{-a}), @option{-d},
6033 @option{--dereference} (@option{-L}), @option{--no-dereference}
6034 (@option{-P}), and @option{-H} options. Special files are copied by
6035 creating a destination file of the same type as the source; see the
6036 @option{--copy-contents} option. It is not portable to use
6037 @option{-r} to copy symbolic links or special files. On some
6038 non-@sc{gnu} systems, @option{-r} implies the equivalent of
6039 @option{-L} and @option{--copy-contents} for historical reasons.
6040 Also, it is not portable to use @option{-R} to copy symbolic links
6041 unless you also specify @option{-P}, as @acronym{POSIX} allows
6042 implementations that dereference symbolic links by default.
6044 @item --remove-destination
6045 @opindex --remove-destination
6046 Remove each existing destination file before attempting to open it
6047 (contrast with @option{-f} above).
6049 @item --sparse=@var{when}
6050 @opindex --sparse=@var{when}
6051 @cindex sparse files, copying
6052 @cindex holes, copying files with
6053 @findex read @r{system call, and holes}
6054 A @dfn{sparse file} contains @dfn{holes}---sequences of zero bytes that
6055 do not occupy any physical disk blocks; the @samp{read} system call
6056 reads these as zeroes. This can both save considerable disk space and
6057 increase speed, since many binary files contain lots of consecutive zero
6058 bytes. By default, @command{cp} detects holes in input source files via a crude
6059 heuristic and makes the corresponding output file sparse as well.
6061 The @var{when} value can be one of the following:
6064 The default behavior: the output file is sparse if the input file is sparse.
6067 Always make the output file sparse. This is useful when the input
6068 file resides on a filesystem that does not support sparse files (the
6069 most notable example is @samp{efs} filesystems in SGI IRIX 5.3 and
6070 earlier), but the output file is on another type of filesystem.
6073 Never make the output file sparse.
6074 This is useful in creating a file for use with the @code{mkswap} command,
6075 since such a file must not have any holes.
6078 @itemx @w{@kbd{--strip-trailing-slashes}}
6079 @opindex --strip-trailing-slashes
6080 @cindex stripping trailing slashes
6081 Remove any trailing slashes from each @var{source} argument.
6082 @xref{Trailing slashes}.
6085 @itemx --symbolic-link
6087 @opindex --symbolic-link
6088 @cindex symbolic links, copying with
6089 Make symbolic links instead of copies of non-directories. All source
6090 file names must be absolute (starting with @samp{/}) unless the
6091 destination files are in the current directory. This option merely
6092 results in an error message on systems that do not support symbolic links.
6094 @item -S @var{suffix}
6095 @itemx --suffix=@var{suffix}
6098 Append @var{suffix} to each backup file made with @option{-b}.
6099 @xref{Backup options}.
6101 @itemx @w{@kbd{--target-directory}=@var{directory}}
6102 @opindex --target-directory
6103 @cindex target directory
6104 @cindex destination directory
6105 Specify the destination @var{directory}.
6106 @xref{Target directory}.
6112 Print the name of each file before copying it.
6114 @item -V @var{method}
6115 @itemx --version-control=@var{method}
6117 @opindex --version-control
6118 Change the type of backups made with @option{-b}. The @var{method}
6119 argument can be @samp{none} (or @samp{off}), @samp{numbered} (or
6120 @samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or
6121 @samp{simple}). @xref{Backup options}.
6124 @itemx --one-file-system
6126 @opindex --one-file-system
6127 @cindex filesystems, omitting copying to different
6128 Skip subdirectories that are on different filesystems from the one that
6129 the copy started on.
6130 However, mount point directories @emph{are} copied.
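As one illustration of how hard links are preserved, the sketch below
copies a directory holding two names for the same file, once with plain
@option{-R} and once with @option{-d} added. The directory names are
hypothetical, and the @samp{-ef} file test is assumed to be available in
the shell, as it is in common Bourne-style shells.

```shell
# Sketch only: src, plain, and linked are invented directory names.
tmp=$(mktemp -d)
mkdir "$tmp/src"
echo data > "$tmp/src/a"
ln "$tmp/src/a" "$tmp/src/b"          # a and b are two names for one file

cp -R "$tmp/src" "$tmp/plain"         # recursive copy without -d
cp -dR "$tmp/src" "$tmp/linked"       # -d implies --preserve=links

# The -ef test is true when both names refer to the same file.
if [ "$tmp/plain/a" -ef "$tmp/plain/b" ]; then plain=shared; else plain=separate; fi
if [ "$tmp/linked/a" -ef "$tmp/linked/b" ]; then linked=shared; else linked=separate; fi
echo "without -d: $plain; with -d: $linked"
rm -rf "$tmp"
```

In both cases the copies are independent of the originals; only the
relationship between the copied names differs.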
6136 @section @command{dd}: Convert and copy a file
6139 @cindex converting while copying a file
6141 @command{dd} copies a file (from standard input to standard output, by
6142 default) with a changeable I/O block size, while optionally performing
6143 conversions on it. Synopsis:
6146 dd [@var{option}]@dots{}
6149 The program accepts the following options. Also see @ref{Common options}.
6151 @cindex multipliers after numbers
6152 The numeric-valued options below (@var{bytes} and @var{blocks}) can be
6153 followed by a multiplier: @samp{b}=512, @samp{c}=1,
6154 @samp{w}=2, @samp{x@var{m}}=@var{m}, or any of the
6155 standard block size suffixes like @samp{k}=1024 (@pxref{Block size}).
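The multipliers can be checked with a quick sketch that copies zero
bytes and counts them. It assumes that @file{/dev/zero} exists on the
system.

```shell
# Each pipeline copies one or two blocks of zero bytes and counts them.
kilo=$(dd if=/dev/zero bs=1k count=2 2>/dev/null | wc -c)      # k = 1024
sector=$(dd if=/dev/zero bs=1b count=1 2>/dev/null | wc -c)    # b = 512
product=$(dd if=/dev/zero bs=2x3 count=1 2>/dev/null | wc -c)  # x multiplies
echo "$kilo $sector $product"
```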
6157 Use different @command{dd} invocations to use different block sizes for
6158 skipping and I/O. For example, the following shell commands copy data
6159 in 512 KiB blocks between a disk and a tape, but do not save or restore a
6160 4 KiB label at the start of the disk:
6163 disk=/dev/rdsk/c0t1d0s2
6164 tape=/dev/rmt/0   # assumed tape device name; substitute your own
6166 # Copy all but the label from disk to tape.
6167 (dd bs=4k skip=1 count=0 && dd bs=512k) <$disk >$tape
6169 # Copy from tape back to disk, but leave the disk label alone.
6170 (dd bs=4k seek=1 count=0 && dd bs=512k) <$tape >$disk
6177 Read from @var{file} instead of standard input.
6181 Write to @var{file} instead of standard output. Unless
6182 @samp{conv=notrunc} is given, @command{dd} truncates @var{file} to zero
6183 bytes (or the size specified with @samp{seek=}).
6185 @item ibs=@var{bytes}
6187 @cindex block size of input
6188 @cindex input block size
6189 Read @var{bytes} bytes at a time.
6191 @item obs=@var{bytes}
6193 @cindex block size of output
6194 @cindex output block size
6195 Write @var{bytes} bytes at a time.
6197 @item bs=@var{bytes}
6200 Both read and write @var{bytes} bytes at a time. This overrides
6201 @samp{ibs} and @samp{obs}.
6203 @item cbs=@var{bytes}
6205 @cindex block size of conversion
6206 @cindex conversion block size
6207 Convert @var{bytes} bytes at a time.
6209 @item skip=@var{blocks}
6211 Skip @var{blocks} @samp{ibs}-byte blocks in the input file before copying.
6213 @item seek=@var{blocks}
6215 Skip @var{blocks} @samp{obs}-byte blocks in the output file before copying.
6217 @item count=@var{blocks}
6219 Copy @var{blocks} @samp{ibs}-byte blocks from the input file, instead
6220 of everything until the end of the file.
6222 @item conv=@var{conversion}[,@var{conversion}]@dots{}
6224 Convert the file as specified by the @var{conversion} argument(s).
6225 Do not put spaces around the commas.
6232 @opindex ascii@r{, converting to}
6233 Convert @acronym{EBCDIC} to @acronym{ASCII}.
6236 @opindex ebcdic@r{, converting to}
6237 Convert @acronym{ASCII} to @acronym{EBCDIC}.
6240 @opindex alternate ebcdic@r{, converting to}
6241 Convert @acronym{ASCII} to alternate @acronym{EBCDIC}.
6244 @opindex block @r{(space-padding)}
6245 For each line in the input, output @samp{cbs} bytes, replacing the
6246 input newline with a space and padding with spaces as necessary.
6250 Replace trailing spaces in each @samp{cbs}-sized input block with a newline.
6254 @opindex lcase@r{, converting to}
6255 Change uppercase letters to lowercase.
6258 @opindex ucase@r{, converting to}
6259 Change lowercase letters to uppercase.
6262 @opindex swab @r{(byte-swapping)}
6263 @cindex byte-swapping
6264 Swap every pair of input bytes. @sc{gnu} @command{dd}, unlike others, works
6265 when an odd number of bytes are read---the last byte is simply copied
6266 (since there is nothing to swap it with).
6270 @cindex read errors, ignoring
6271 Continue after read errors.
6275 @cindex truncating output file, avoiding
6276 Do not truncate the output file.
6279 @opindex sync @r{(padding with nulls)}
6280 Pad every input block to the size of @samp{ibs} with trailing zero bytes.
6281 When used with @samp{block} or @samp{unblock}, pad with spaces instead of zero bytes.
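The @samp{skip}, @samp{count}, and @samp{conv} operands described above
can be combined, as in this small sketch. The file names @file{in} and
@file{out} are hypothetical and live in a scratch directory.

```shell
# Sketch only: in and out are invented file names.
tmp=$(mktemp -d)
printf 'abcdefghij' > "$tmp/in"

# Skip two 2-byte input blocks, copy the next three, and upper-case them.
dd if="$tmp/in" of="$tmp/out" ibs=2 skip=2 count=3 conv=ucase 2>/dev/null
upper=$(cat "$tmp/out")

# Overwrite the first two bytes in place; notrunc keeps the rest of the file.
printf 'xy' | dd of="$tmp/out" bs=2 count=1 conv=notrunc 2>/dev/null
patched=$(cat "$tmp/out")

echo "$upper $patched"
rm -rf "$tmp"
```

Without @samp{conv=notrunc}, the second command would leave only the two
new bytes in the output file.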
6288 @node install invocation
6289 @section @command{install}: Copy files and set attributes
6292 @cindex copying files and setting attributes
6294 @command{install} copies files while setting their permission modes and, if
6295 possible, their owner and group. Synopses:
6298 install [@var{option}]@dots{} @var{source} @var{dest}
6299 install [@var{option}]@dots{} @var{source}@dots{} @var{directory}
6300 install -d [@var{option}]@dots{} @var{directory}@dots{}
6303 In the first of these, the @var{source} file is copied to the @var{dest}
6304 target file. In the second, each of the @var{source} files is copied
6305 to the destination @var{directory}. In the last, each @var{directory}
6306 (and any missing parent directories) is created.
6308 @cindex Makefiles, installing programs in
6309 @command{install} is similar to @code{cp}, but allows you to control the
6310 attributes of destination files. It is typically used in Makefiles to
6311 copy programs into their destination directories. It refuses to copy
6312 files onto themselves.
6314 The program accepts the following options. Also see @ref{Common options}.
6319 @itemx @w{@kbd{--backup}[=@var{method}]}
6322 @vindex VERSION_CONTROL
6323 @cindex backups, making
6324 @xref{Backup options}.
6325 Make a backup of each file that would otherwise be overwritten or removed.
6329 Ignored; for compatibility with old Unix versions of @command{install}.
6334 @opindex --directory
6335 @cindex directories, creating with given attributes
6336 @cindex parent directories, creating missing
6337 @cindex leading directories, creating missing
6338 Create each given directory and any missing parent directories, setting
6339 the owner, group and mode as given on the command line or to the
6340 defaults. It also gives any parent directories it creates those
6341 attributes. (This is different from the SunOS 4.x @command{install}, which
6342 gives directories that it creates the default attributes.)
6344 @item -g @var{group}
6345 @itemx --group=@var{group}
6348 @cindex group ownership of installed files, setting
6349 Set the group ownership of installed files or directories to
6350 @var{group}. The default is the process's current group. @var{group}
6351 may be either a group name or a numeric group id.
6354 @itemx --mode=@var{mode}
6357 @cindex permissions of installed files, setting
6358 Set the permissions for the installed file or directory to @var{mode},
6359 which can be either an octal number, or a symbolic mode as in
6360 @command{chmod}, with 0 as the point of departure (@pxref{File
6361 permissions}). The default mode is @samp{u=rwx,go=rx}---read, write,
6362 and execute for the owner, and read and execute for group and other.
6364 @item -o @var{owner}
6365 @itemx --owner=@var{owner}
6368 @cindex ownership of installed files, setting
6369 @cindex appropriate privileges
6370 @vindex root @r{as default owner}
6371 If @command{install} has appropriate privileges (is run as root), set the
6372 ownership of installed files or directories to @var{owner}. The default
6373 is @code{root}. @var{owner} may be either a user name or a numeric user id.
6377 @itemx --preserve-timestamps
6379 @opindex --preserve-timestamps
6380 @cindex timestamps of installed files, preserving
6381 Set the time of last access and the time of last modification of each
6382 installed file to match those of each corresponding original file.
6383 When a file is installed without this option, its last access and
6384 last modification times are both set to the time of installation.
6385 This option is useful if you want to use the last modification times
6386 of installed files to keep track of when they were last built as opposed
6387 to when they were last installed.
6393 @cindex symbol table information, stripping
6394 @cindex stripping symbol table information
6395 Strip the symbol tables from installed binary executables.
6397 @item -S @var{suffix}
6398 @itemx --suffix=@var{suffix}
6401 Append @var{suffix} to each backup file made with @option{-b}.
6402 @xref{Backup options}.
6404 @itemx @w{@kbd{--target-directory}=@var{directory}}
6405 @opindex --target-directory
6406 @cindex target directory
6407 @cindex destination directory
6408 Specify the destination @var{directory}.
6409 @xref{Target directory}.
6415 Print the name of each file before copying it.
6417 @item -V @var{method}
6418 @itemx --version-control=@var{method}
6420 @opindex --version-control
6421 Change the type of backups made with @option{-b}. The @var{method}
6422 argument can be @samp{none} (or @samp{off}), @samp{numbered} (or
6423 @samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or
6424 @samp{simple}). @xref{Backup options}.
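A short sketch of staging a file with @command{install} follows. The
@file{stage/bin} tree and the file names are hypothetical, and the
sketch assumes that @sc{gnu} @command{stat} is available to report the
modes and modification times.

```shell
# Sketch only: the stage/bin tree and the file names are invented.
tmp=$(mktemp -d)
echo '#!/bin/sh' > "$tmp/tool"
touch -d '2002-03-30 23:45:00' "$tmp/tool"

install -d -m 750 "$tmp/stage/bin"                       # create the directory
install "$tmp/tool" "$tmp/stage/bin/tool"                # default mode u=rwx,go=rx
install -p -m 644 "$tmp/tool" "$tmp/stage/bin/tool.txt"  # explicit mode, keep times

dir_mode=$(stat -c %a "$tmp/stage/bin")
default_mode=$(stat -c %a "$tmp/stage/bin/tool")
file_mode=$(stat -c %a "$tmp/stage/bin/tool.txt")
# With -p the installed copy carries the source's modification time.
src_time=$(stat -c %Y "$tmp/tool")
dst_time=$(stat -c %Y "$tmp/stage/bin/tool.txt")

echo "$dir_mode $default_mode $file_mode"
rm -rf "$tmp"
```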
6430 @section @command{mv}: Move (rename) files
6434 @command{mv} moves or renames files (or directories). Synopsis:
6437 mv [@var{option}]@dots{} @var{source} @var{dest}
6438 mv [@var{option}]@dots{} @var{source}@dots{} @var{directory}
6441 If the last argument names an existing directory, @command{mv} moves each
6442 other given file into a file with the same name in that directory.
6443 Otherwise, if only two files are given, it renames the first as
6444 the second. It is an error if the last argument is not a directory
6445 and more than two files are given.
6447 @command{mv} can move any type of file from one filesystem to another.
6448 Prior to version @code{4.0} of the fileutils,
6449 @command{mv} could move only regular files between filesystems.
6450 For example, now @command{mv} can move an entire directory hierarchy
6451 including special device files from one partition to another. It first
6452 uses some of the same code that's used by @code{cp -a} to copy the
6453 requested directories and files, then (assuming the copy succeeded)
6454 it removes the originals. If the copy fails, then the part that was
6455 copied to the destination partition is removed. If you were to move
6456 three directories from one partition to another and the copy of the first
6457 directory succeeded, but the second didn't, the first would be left on
6458 the destination partition and the second and third would be left on the original partition.
6461 @cindex prompting, and @command{mv}
6462 If a destination file exists but is normally unwritable, standard input
6463 is a terminal, and the @option{-f} or @option{--force} option is not given,
6464 @command{mv} prompts the user for whether to replace the file. (You might
6465 own the file, or have write permission on its directory.) If the
6466 response does not begin with @samp{y} or @samp{Y}, the file is skipped.
6468 @emph{Warning}: If you try to move a symlink that points to a directory,
6469 and you specify the symlink with a trailing slash, then @command{mv}
6470 doesn't move the symlink but instead moves the directory referenced
6471 by the symlink. @xref{Trailing slashes}.
6473 The program accepts the following options. Also see @ref{Common options}.
6478 @itemx @w{@kbd{--backup}[=@var{method}]}
6481 @vindex VERSION_CONTROL
6482 @cindex backups, making
6483 @xref{Backup options}.
6484 Make a backup of each file that would otherwise be overwritten or removed.
6490 @cindex prompts, omitting
6491 Do not prompt the user before removing a destination file.
6494 @itemx --interactive
6496 @opindex --interactive
6497 @cindex prompts, forcing
6498 Prompt whether to overwrite each existing destination file, regardless
6499 of its permissions. If the response does not begin with @samp{y} or
6500 @samp{Y}, the file is skipped.
6502 @itemx @w{@kbd{--reply}[=@var{how}]}
6504 @cindex interactivity
6505 Specifying @option{--reply=yes} is equivalent to using @option{--force}.
6506 Specify @option{--reply=no} to make @command{mv} act as if @samp{no} were
6507 given as a response to every prompt about a destination file.
6508 Specify @option{--reply=query} to make @command{mv} prompt the user
6509 about each existing destination file.
6515 @cindex newer files, moving only
6516 Do not move a non-directory that has an existing destination with the
6517 same or newer modification time.
6523 Print the name of each file before moving it.
6525 @itemx @w{@kbd{--strip-trailing-slashes}}
6526 @opindex --strip-trailing-slashes
6527 @cindex stripping trailing slashes
6528 Remove any trailing slashes from each @var{source} argument.
6529 @xref{Trailing slashes}.
6531 @item -S @var{suffix}
6532 @itemx --suffix=@var{suffix}
6535 Append @var{suffix} to each backup file made with @option{-b}.
6536 @xref{Backup options}.
6538 @itemx @w{@kbd{--target-directory}=@var{directory}}
6539 @opindex --target-directory
6540 @cindex target directory
6541 @cindex destination directory
6542 Specify the destination @var{directory}.
6543 @xref{Target directory}.
6545 @item -V @var{method}
6546 @itemx --version-control=@var{method}
6548 @opindex --version-control
6549 Change the type of backups made with @option{-b}. The @var{method}
6550 argument can be @samp{none} (or @samp{off}), @samp{numbered} (or
6551 @samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or
6552 @samp{simple}). @xref{Backup options}.
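The sketch below shows two of these options side by side: a move that is
skipped because the destination is newer, and a move that keeps a simple
backup of the file it replaces. The file names are hypothetical, and the
default backup suffix @samp{~} is assumed.

```shell
# Sketch only: src and dest are invented file names.
tmp=$(mktemp -d)
echo old > "$tmp/src";  touch -d '2002-03-30 12:00:00' "$tmp/src"
echo new > "$tmp/dest"; touch -d '2002-03-31 12:00:00' "$tmp/dest"

# dest is newer, so -u leaves both files where they are.
mv -u "$tmp/src" "$tmp/dest"
after_update=$(cat "$tmp/dest")

# Without -u the move happens; --backup=simple keeps the old dest as dest~.
mv --backup=simple "$tmp/src" "$tmp/dest"
after_move=$(cat "$tmp/dest")
backup=$(cat "$tmp/dest~")

echo "$after_update $after_move $backup"
rm -rf "$tmp"
```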
6558 @section @command{rm}: Remove files or directories
6561 @cindex removing files or directories
6563 @command{rm} removes each given @var{file}. By default, it does not remove
6564 directories. Synopsis:
6567 rm [@var{option}]@dots{} [@var{file}]@dots{}
6570 @cindex prompting, and @command{rm}
6571 If a file is unwritable, standard input is a terminal, and the @option{-f}
6572 or @option{--force} option is not given, or the @option{-i} or
6573 @option{--interactive} option @emph{is} given, @command{rm} prompts the user
6574 for whether to remove the file. If the response does not begin with
6575 @samp{y} or @samp{Y}, the file is skipped.
6577 @emph{Warning}: If you use @command{rm} to remove a file, it is usually
6578 possible to recover the contents of that file. If you want more assurance
6579 that the contents are truly unrecoverable, consider using @command{shred}.
6581 The program accepts the following options. Also see @ref{Common options}.
6588 @opindex --directory
6589 @cindex directories, removing with @code{unlink}
6592 Attempt to remove directories using the @code{unlink} function rather than
6593 the @code{rmdir} function, and
6594 don't require a directory to be empty before trying to unlink it. This works
6595 only if you have appropriate privileges and if your operating system supports
6596 @code{unlink} for directories. Because unlinking a directory causes any files
6597 in the deleted directory to become unreferenced, it is wise to @command{fsck}
6598 the filesystem after doing this.
6604 Ignore nonexistent files and never prompt the user.
6605 Ignore any previous @option{--interactive} (@option{-i}) option.
6608 @itemx --interactive
6610 @opindex --interactive
6611 Prompt whether to remove each file. If the response does not begin
6612 with @samp{y} or @samp{Y}, the file is skipped.
6613 Ignore any previous @option{--force} (@option{-f}) option.
6620 @opindex --recursive
6621 @cindex directories, removing (recursively)
6622 Remove the contents of directories recursively.
6628 Print the name of each file before removing it.
6632 @cindex files beginning with @samp{-}, removing
6633 @cindex @samp{-}, removing files beginning with
6634 One common question is how to remove files whose names begin with a
6635 @samp{-}. @sc{gnu} @command{rm}, like every program that uses the @code{getopt}
6636 function to parse its arguments, lets you use the @samp{--} option to
6637 indicate that all following arguments are non-options. To remove a file
6638 called @file{-f} in the current directory, you could type either:
6651 @opindex - @r{and Unix @command{rm}}
6652 The Unix @command{rm} program's use of a single @samp{-} for this purpose
6653 predates the development of the getopt standard syntax.
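The two forms described above can be sketched as follows. This is only an illustration: the scratch directory made with @command{mktemp} and the variable names are assumptions, not part of @command{rm} itself.

```shell
# Work in a scratch directory so nothing of value is touched.
tmp=$(mktemp -d)
cd "$tmp"

# Create two files whose names look like options.
touch -- -f -r

# Form 1: '--' ends option processing, so '-f' is taken as a file name.
rm -- -f

# Form 2: a leading './' keeps the argument from starting with '-'.
rm ./-r

remaining=$(ls -A | wc -l)
cd /
rm -rf "$tmp"
echo "files left: $remaining"
```

Both forms leave the scratch directory empty; the second one also works with programs that do not understand @samp{--}.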
6656 @node shred invocation
6657 @section @command{shred}: Remove files more securely
6660 @cindex data, erasing
6661 @cindex erasing data
6663 @command{shred} overwrites devices or files, to help prevent even
6664 very expensive hardware from recovering the data.
6666 Ordinarily when you remove a file (@pxref{rm invocation}), the data is
6667 not actually destroyed. Only the index listing where the file is
6668 stored is destroyed, and the storage is made available for reuse.
6669 There are undelete utilities that will attempt to reconstruct the index
6670 and can bring the file back if the parts were not reused.
6672 On a busy system with a nearly-full drive, space can get reused in a few
6673 seconds. But there is no way to know for sure. If you have sensitive
6674 data, you may want to be sure that recovery is not possible by actually
6675 overwriting the file with non-sensitive data.
6677 However, even after doing that, it is possible to take the disk back
6678 to a laboratory and use a lot of sensitive (and expensive) equipment
6679 to look for the faint ``echoes'' of the original data underneath the
6680 overwritten data. If the data has only been overwritten once, it's not even that hard.
6683 The best way to remove something irretrievably is to destroy the media
6684 it's on with acid, melt it down, or the like. For cheap removable media
6685 like floppy disks, this is the preferred method. However, hard drives
6686 are expensive and hard to melt, so the @command{shred} utility tries
6687 to achieve a similar effect non-destructively.
6689 This uses many overwrite passes, with the data patterns chosen to
6690 maximize the damage they do to the old data. While this will work on
6691 floppies, the patterns are designed for best effect on hard drives.
6692 For more details, see the source code and Peter Gutmann's paper
6693 @cite{Secure Deletion of Data from Magnetic and Solid-State Memory},
6694 from the proceedings of the Sixth USENIX Security Symposium (San Jose,
6695 California, 22--25 July, 1996). The paper is also available online at
6696 @url{http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html}.
6698 @strong{Please note} that @command{shred} relies on a very important assumption:
6699 that the filesystem overwrites data in place. This is the traditional
6700 way to do things, but many modern filesystem designs do not satisfy this
6701 assumption. Exceptions include:
6706 Log-structured or journaled filesystems, such as those supplied with
6707 AIX and Solaris, and JFS, ReiserFS, XFS, Ext3, etc.
6710 Filesystems that write redundant data and carry on even if some writes
6711 fail, such as RAID-based filesystems.
6714 Filesystems that make snapshots, such as Network Appliance's NFS server.
6717 Filesystems that cache in temporary locations, such as NFS version 3 clients.
6721 Compressed filesystems.
6724 If you are not sure how your filesystem operates, then you should assume
6725 that it does not overwrite data in place, which means that @command{shred} cannot
6726 reliably operate on regular files in your filesystem.
6728 Generally speaking, it is more reliable to shred a device than a file,
6729 since this bypasses the problem of filesystem design mentioned above.
6730 However, even shredding devices is not always completely reliable. For
6731 example, most disks map out bad sectors invisibly to the application; if
6732 the bad sectors contain sensitive data, @command{shred} won't be able to destroy it.
6735 @command{shred} makes no attempt to detect or report this problem, just as
6736 it makes no attempt to do anything about backups. However, since it is
6737 more reliable to shred devices than files, @command{shred} by default does
6738 not truncate or remove the output file. This default is more suitable
6739 for devices, which typically cannot be truncated and should not be removed.
6742 Finally, consider the risk of backups and mirrors.
6743 File system backups and remote mirrors may contain copies of the
6744 file that cannot be removed, and that will allow a shredded file
6745 to be recovered later. So if you keep any data you may later want
6746 to destroy using @command{shred}, be sure that it is not backed up or mirrored.
6749 shred [@var{option}]@dots{} @var{file}[@dots{}]
6752 The program accepts the following options. Also see @ref{Common options}.
6760 @cindex force deletion
6761 Override file permissions if necessary to allow overwriting.
6764 @itemx -n @var{NUMBER}
6765 @itemx --iterations=@var{NUMBER}
6766 @opindex -n @var{NUMBER}
6767 @opindex --iterations=@var{NUMBER}
6768 @cindex iterations, selecting the number of
6769 By default, @command{shred} uses 25 passes of overwrite. This is enough
6770 for all of the useful overwrite patterns to be used at least once.
6771 You can reduce this to save time, or increase it if you have a lot of time.
6774 @item -s @var{BYTES}
6775 @itemx --size=@var{BYTES}
6776 @opindex -s @var{BYTES}
6777 @opindex --size=@var{BYTES}
6778 @cindex size of file to shred
6779 Shred the first @var{BYTES} bytes of the file. The default is to shred
6780 the whole file. @var{BYTES} can be followed by a size specification like
6781 @samp{K}, @samp{M}, or @samp{G} to specify a multiple. @xref{Block size}.
6787 @cindex removing files after shredding
6788 After shredding a file, truncate it (if possible) and then remove it.
6789 If a file has multiple links, only the named links will be removed.
6795 Display status updates as sterilization proceeds.
6801 Normally, @command{shred} rounds the file size up to the next multiple of
6802 the filesystem block size to fully erase the last block of the file.
6803 This option suppresses that behavior.
6804 Thus, by default if you shred a 10-byte file on a system with 512-byte
6805 blocks, the resulting file will be 512 bytes long. With this option,
6806 @command{shred} does not increase the size of the file.
6812 Normally, the last pass that @command{shred} writes is made up of
6813 random data. If this would be conspicuous on your hard drive (for
6814 example, because it looks like encrypted data), or you just think
6815 it's tidier, the @option{--zero} option adds an additional overwrite pass with
6816 all zero bits. This is in addition to the number of passes specified
6817 by the @option{--iterations} option.
6821 Shred standard output.
6823 This argument is considered an option. If the common @samp{--} option has
6824 been used to indicate the end of options on the command line, then @samp{-}
6825 will be interpreted as an ordinary file name.
6827 The intended use of this is to shred a removed temporary file.
6831 i=`tempfile -m 0600`
6834 echo "Hello, world" >&3
6839 Note that the shell command @samp{shred - >file} does not shred the
6840 contents of @var{file}, since it truncates @var{file} before invoking
6841 @command{shred}. Use the command @samp{shred file} or (if using a
6842 Bourne-compatible shell) the command @samp{shred - 1<>file} instead.
6846 You might use the following command to erase all trace of the
6847 filesystem you'd created on the floppy disk in your first drive.
6848 That command takes about 20 minutes to erase a ``1.44MB'' (actually 1440 KiB) floppy.
6852 shred --verbose /dev/fd0
6855 Similarly, to erase all data on a selected partition of
6856 your hard disk, you could give a command like this:
6859 shred --verbose /dev/sda5
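For experimenting without touching a real device, a regular file in a scratch directory will do. The following is a minimal sketch, assuming @sc{gnu} @command{shred} and a filesystem that overwrites data in place (see the caveats above); the file name and variable names are made up for the example.

```shell
tmp=$(mktemp -d)
f="$tmp/secret.txt"
printf 'secret123\n' > "$f"      # 10 bytes of "sensitive" data

# Two random passes plus a final pass of zeros (-z); -x keeps the
# file at its exact size instead of rounding up to a full block.
shred -n 2 -z -x "$f"
size_after=$(wc -c < "$f")
leaked=no
if grep -q secret "$f"; then leaked=yes; fi

# Overwrite once more, then truncate and remove the file (-u).
shred -n 1 -u "$f"
exists=no
if [ -e "$f" ]; then exists=yes; fi

rm -rf "$tmp"
echo "size=$size_after leaked=$leaked exists=$exists"
```

This only shows what the options do to the file as seen through the filesystem; it says nothing about whether older copies of the data survive elsewhere on the medium.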
6862 @node Special file types
6863 @chapter Special file types
6865 @cindex special file types
6866 @cindex file types, special
6868 This chapter describes commands which create special types of files (and
6869 @command{rmdir}, which removes directories, one special file type).
6871 @cindex special file types
6873 Although Unix-like operating systems have markedly fewer special file
6874 types than others, not @emph{everything} can be treated only as the
6875 undifferentiated byte stream of @dfn{normal files}. For example, when a
6876 file is created or removed, the system must record this information,
6877 which it does in a @dfn{directory}---a special type of file. Although
6878 you can read directories as normal files, if you're curious, in order
6879 for the system to do its job it must impose a structure, a certain
6880 order, on the bytes of the file. Thus it is a ``special'' type of file.
6882 Besides directories, other special file types include named pipes
6883 (FIFOs), symbolic links, sockets, and so-called @dfn{special files}.
6886 * link invocation:: Make a hard link via the link syscall
6887 * ln invocation:: Make links between files.
6888 * mkdir invocation:: Make directories.
6889 * mkfifo invocation:: Make FIFOs (named pipes).
6890 * mknod invocation:: Make block or character special files.
6891 * readlink invocation:: Print the referent of a symbolic link.
6892 * rmdir invocation:: Remove empty directories.
6893 * unlink invocation:: Remove files via the unlink syscall
6897 @node link invocation
6898 @section @command{link}: Make a hard link via the link syscall
6901 @cindex links, creating
6902 @cindex hard links, creating
6903 @cindex creating links (hard only)
6905 @command{link} creates a single hard link at a time.
6906 It is a minimalist interface to the system-provided
6907 @code{link} function. @xref{Hard Links, , , libc,
6908 The GNU C Library Reference Manual}.
6912 link @var{filename} @var{linkname}
6915 @var{filename} must specify an existing file, and @var{linkname}
6916 must specify a nonexistent entry in an existing directory.
6917 @command{link} simply calls @code{link (@var{filename}, @var{linkname})} to create the link.
6921 @section @command{ln}: Make links between files
6924 @cindex links, creating
6925 @cindex hard links, creating
6926 @cindex symbolic (soft) links, creating
6927 @cindex creating links (hard or soft)
6929 @cindex filesystems and hard links
6930 @command{ln} makes links between files. By default, it makes hard links;
6931 with the @option{-s} option, it makes symbolic (or @dfn{soft}) links.
6935 ln [@var{option}]@dots{} @var{target} [@var{linkname}]
6936 ln [@var{option}]@dots{} @var{target}@dots{} @var{directory}
6941 @item If the last argument names an existing directory, @command{ln} creates a
6942 link to each @var{target} file in that directory, using the
6943 @var{target}s' names. (But see the description of the
6944 @option{--no-dereference} option below.)
6946 @item If two filenames are given, @command{ln} creates a link from the
6947 second to the first.
6949 @item If one @var{target} is given, @command{ln} creates a link to that
6950 file in the current directory.
6952 @item It is an error if the last argument is not a directory and more
6953 than two files are given. Without @option{-f} or @option{-i} (see below),
6954 @command{ln} will not remove an existing file. Use the @option{--backup}
6955 option to make @command{ln} rename existing files.
6959 @cindex hard link, defined
6960 @cindex inode, and hard links
6961 A @dfn{hard link} is another name for an existing file; the link and the
6962 original are indistinguishable. Technically speaking, they share the
6963 same inode, and the inode contains all the information about a
6964 file---indeed, it is not incorrect to say that the inode @emph{is} the
6965 file. On all existing implementations, you cannot make a hard link to
6966 a directory, and hard links cannot cross filesystem boundaries. (These
6967 restrictions are not mandated by @acronym{POSIX}, however.)
6969 @cindex dereferencing symbolic links
6970 @cindex symbolic link, defined
6971 @dfn{Symbolic links} (@dfn{symlinks} for short), on the other hand, are
6972 a special file type (which not all kernels support: System V release 3
6973 (and older) systems lack symlinks) in which the link file actually
6974 refers to a different file, by name. When most operations (opening,
6975 reading, writing, and so on) are passed the symbolic link file, the
6976 kernel automatically @dfn{dereferences} the link and operates on the
6977 target of the link. But some operations (e.g., removing) work on the
6978 link file itself, rather than on its target. @xref{Symbolic Links,,,
6979 libc, The GNU C Library Reference Manual}.
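The difference between the two kinds of link can be observed directly. The sketch below assumes @sc{gnu} @command{stat} (for the @samp{-c %i} inode format) and uses illustrative file names in a scratch directory.

```shell
tmp=$(mktemp -d)
cd "$tmp"
echo data > original

ln original hard        # hard link: a second name for the same inode
ln -s original soft     # symbolic link: a small file naming "original"

inode_orig=$(stat -c %i original)
inode_hard=$(stat -c %i hard)
soft_target=$(readlink soft)

# Removing the first name leaves the data reachable via the hard link,
# while the symbolic link now refers to a name that no longer exists.
rm original
hard_content=$(cat hard)
soft_resolves=yes
if [ ! -e soft ]; then soft_resolves=no; fi

cd /
rm -rf "$tmp"
echo "inodes: $inode_orig $inode_hard; soft -> $soft_target; resolves: $soft_resolves"
```

The two inode numbers printed are equal, and the symbolic link no longer resolves once @file{original} is gone.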
6981 The program accepts the following options. Also see @ref{Common options}.
6986 @itemx @w{@kbd{--backup}[=@var{method}]}
6989 @vindex VERSION_CONTROL
6990 @cindex backups, making
6991 @xref{Backup options}.
6992 Make a backup of each file that would otherwise be overwritten or removed.
6999 @opindex --directory
7000 @cindex hard links to directories
7001 Allow the super-user to make hard links to directories.
7007 Remove existing destination files.
7010 @itemx --interactive
7012 @opindex --interactive
7013 @cindex prompting, and @command{ln}
7014 Prompt whether to remove existing destination files.
7017 @itemx --no-dereference
7019 @opindex --no-dereference
7020 When given an explicit destination that is a symlink to a directory,
7021 treat that destination as if it were a normal file.
7023 When the destination is an actual directory (not a symlink to one),
7024 there is no ambiguity. The link is created in that directory.
7025 But when the specified destination is a symlink to a directory,
7026 there are two ways to treat the user's request. @command{ln} can
7027 treat the destination just as it would a normal directory and create
7028 the link in it. On the other hand, the destination can be viewed as a
7029 non-directory---as the symlink itself. In that case, @command{ln}
7030 must delete or back up that symlink before creating the new link.
7031 The default is to treat a destination that is a symlink to a directory
7032 just like a directory.
7038 Make symbolic links instead of hard links. This option merely produces
7039 an error message on systems that do not support symbolic links.
7041 @item -S @var{suffix}
7042 @itemx --suffix=@var{suffix}
7045 Append @var{suffix} to each backup file made with @option{-b}.
7046 @xref{Backup options}.
7048 @itemx @w{@kbd{--target-directory}=@var{directory}}
7049 @opindex --target-directory
7050 @cindex target directory
7051 @cindex destination directory
7052 Specify the destination @var{directory}.
7053 @xref{Target directory}.
7059 Print the name of each file before linking it.
7061 @item -V @var{method}
7062 @itemx --version-control=@var{method}
7064 @opindex --version-control
7065 Change the type of backups made with @option{-b}. The @var{method}
7066 argument can be @samp{none} (or @samp{off}), @samp{numbered} (or
7067 @samp{t}), @samp{existing} (or @samp{nil}), or @samp{never} (or
7068 @samp{simple}). @xref{Backup options}.
7075 ln -s /some/name # creates link ./name pointing to /some/name
7076 ln -s /some/name myname # creates link ./myname pointing to /some/name
7077 ln -s a b .. # creates links ../a and ../b pointing to ./a and ./b
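The effect of @option{--no-dereference} (@option{-n}) on a destination that is a symlink to a directory can be sketched as follows. The names @file{dir}, @file{other}, and @file{current} are illustrative, and everything happens in a scratch directory.

```shell
tmp=$(mktemp -d)
cd "$tmp"
mkdir dir other
ln -s dir current       # current -> dir

# Default: "current" is treated as the directory it points to,
# so the new link is created inside dir.
ln -s /some/name current
inside=$(readlink dir/name)

# With -n (and -f to permit replacement), "current" itself is the
# destination: the old symlink is removed and a new one takes its place.
ln -sfn other current
now=$(readlink current)

cd /
rm -rf "$tmp"
echo "dir/name -> $inside; current -> $now"
```

The second form is a common way to repoint a symlink at a different directory.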
7081 @node mkdir invocation
7082 @section @command{mkdir}: Make directories
7085 @cindex directories, creating
7086 @cindex creating directories
7088 @command{mkdir} creates directories with the specified names. Synopsis:
7091 mkdir [@var{option}]@dots{} @var{name}@dots{}
7094 If a @var{name} is an existing file but not a directory, @command{mkdir} prints a
7095 warning message on standard error and exits with a status of 1 after
7096 processing any remaining @var{name}s. The same is done when a @var{name} is an
7097 existing directory and the @option{-p} option is not given. If a @var{name} is an
7098 existing directory and the @option{-p} option is given, @command{mkdir} ignores it.
7099 That is, @command{mkdir} does not print a warning, raise an error, or change
7100 the mode of the directory (even if the @option{-m} option is given), and
7101 moves on to processing any remaining @var{name}s.
7103 The program accepts the following options. Also see @ref{Common options}.
7108 @itemx --mode=@var{mode}
7111 @cindex modes of created directories, setting
7112 Set the mode of created directories to @var{mode}, which is symbolic as
7113 in @command{chmod} and uses @samp{a=rwx} (read, write and execute allowed for
7114 everyone) minus the bits set in the umask as the point of
7115 departure. @xref{File permissions}.
7121 @cindex parent directories, creating
7122 Make any missing parent directories for each argument. The mode for parent
7123 directories is set to the umask modified by @samp{u+wx}.
7124 Ignore arguments corresponding to existing directories.
7130 Print a message for each created directory. This is most useful with @option{--parents}.
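The behavior with existing directories and the interaction of @option{-p} with @option{-m} can be tried out as below. This is a sketch under stated assumptions: @sc{gnu} @command{stat} for the @samp{-c %a} format, a umask of 022 set explicitly, and made-up directory names.

```shell
tmp=$(mktemp -d)
cd "$tmp"
umask 022

mkdir a
# A second attempt without -p fails because "a" already exists.
status_plain=0
mkdir a 2>/dev/null || status_plain=$?
# With -p, an existing directory is silently accepted.
status_p=0
mkdir -p a || status_p=$?

# -p creates missing parents; -m applies to the final directory only.
mkdir -p -m 700 x/y/z
mode_leaf=$(stat -c %a x/y/z)
mode_parent=$(stat -c %a x/y)

cd /
rm -rf "$tmp"
echo "plain=$status_plain with-p=$status_p leaf=$mode_leaf parent=$mode_parent"
```

The parent directories get their mode from the umask rather than from @option{-m}.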
7135 @node mkfifo invocation
7136 @section @command{mkfifo}: Make FIFOs (named pipes)
7139 @cindex FIFOs, creating
7140 @cindex named pipes, creating
7141 @cindex creating FIFOs (named pipes)
7143 @command{mkfifo} creates FIFOs (also called @dfn{named pipes}) with the
7144 specified names. Synopsis:
7147 mkfifo [@var{option}] @var{name}@dots{}
7150 A @dfn{FIFO} is a special file type that permits independent processes
7151 to communicate. One process opens the FIFO file for writing, and
7152 another for reading, after which data can flow as with the usual
7153 anonymous pipe in shells or elsewhere.
7155 The program accepts the following option. Also see @ref{Common options}.
7160 @itemx --mode=@var{mode}
7163 @cindex modes of created FIFOs, setting
7164 Set the mode of created FIFOs to @var{mode}, which is symbolic as in
7165 @command{chmod} and uses @samp{a=rw} (read and write allowed for everyone) minus
7166 the bits set in the umask for the point of departure. @xref{File permissions}.
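A minimal sketch of two processes communicating through a FIFO follows. The path and the message are illustrative, and the background job stands in for an independent writer process.

```shell
tmp=$(mktemp -d)
fifo="$tmp/pipe"
mkfifo -m 600 "$fifo"

# The writer blocks until a reader opens the FIFO, so run it in the
# background and read from the foreground.
echo "hello through the FIFO" > "$fifo" &
read -r line < "$fifo"
wait

is_fifo=no
if [ -p "$fifo" ]; then is_fifo=yes; fi
rm -rf "$tmp"
echo "$line (fifo: $is_fifo)"
```

Unlike an anonymous pipe, the FIFO has a name in the filesystem, so the reader and writer need not share a common ancestor.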
7171 @node mknod invocation
7172 @section @command{mknod}: Make block or character special files
7175 @cindex block special files, creating
7176 @cindex character special files, creating
7178 @command{mknod} creates a FIFO, character special file, or block special
7179 file with the specified name. Synopsis:
7182 mknod [@var{option}]@dots{} @var{name} @var{type} [@var{major} @var{minor}]
7185 @cindex special files
7186 @cindex block special files
7187 @cindex character special files
7188 Unlike the phrase ``special file type'' above, the term @dfn{special
7189 file} has a technical meaning on Unix: something that can generate or
7190 receive data. Usually this corresponds to a physical piece of hardware,
7191 e.g., a printer or a disk. (These files are typically created at
7192 system-configuration time.) The @command{mknod} command is what creates
7193 files of this type. Such devices can be read either a character at a
7194 time or a ``block'' (many characters) at a time, hence we say there are
7195 @dfn{block special} files and @dfn{character special} files.
7197 The arguments after @var{name} specify the type of file to make:
7202 @opindex p @r{for FIFO file}
7206 @opindex b @r{for block special file}
7207 for a block special file
7210 @c Don't document the `u' option -- it's just a synonym for `c'.
7211 @c Do *any* versions of mknod still use it?
7213 @opindex c @r{for character special file}
7214 @c @opindex u @r{for character special file}
7215 for a character special file
7219 When making a block or character special file, the major and minor
7220 device numbers must be given after the file type.
7221 If a major or minor device number begins with @samp{0x} or @samp{0X},
7222 it is interpreted as hexadecimal; otherwise, if it begins with @samp{0},
7223 as octal; otherwise, as decimal.
7225 The program accepts the following option. Also see @ref{Common options}.
7230 @itemx --mode=@var{mode}
7233 Set the mode of created files to @var{mode}, which is symbolic as in
7234 @command{chmod} and uses @samp{a=rw} minus the bits set in the umask as the point
7235 of departure. @xref{File permissions}.
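As a hedged illustration, the sketch below creates a FIFO with @command{mknod}, which needs no device numbers, and uses shell arithmetic to show how the three number spellings are read. Shell arithmetic happens to follow the same prefix rules; it is used here only as a convenient calculator and is not part of @command{mknod}.

```shell
tmp=$(mktemp -d)

# Type "p" needs no device numbers and no special privileges.
mknod -m 600 "$tmp/queue" p
made_fifo=no
if [ -p "$tmp/queue" ]; then made_fifo=yes; fi

# The same digits mean different values depending on the prefix:
# none is decimal, a leading 0 is octal, a leading 0x is hexadecimal.
dec=$((10))
oct=$((010))
hex=$((0x10))

# Creating a device node normally requires privileges, so this is
# left commented out (1 and 3 are the numbers Linux uses for /dev/null):
#   mknod "$tmp/null" c 1 3

rm -rf "$tmp"
echo "fifo=$made_fifo dec=$dec oct=$oct hex=$hex"
```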
7240 @node readlink invocation
7241 @section @command{readlink}: Print the referent of a symbolic link
7244 @cindex displaying value of a symbolic link
7246 @command{readlink} may work in one of two supported modes:
7252 @command{readlink} outputs the value of the given symbolic link.
7253 If @command{readlink} is invoked with an argument other than the pathname
7254 of a symbolic link, it exits with a non-zero exit code.
7256 @item Canonicalize mode
7258 @command{readlink} outputs the absolute name of the given file which contains
7259 no `.', `..' components nor any repeated path separators (`/') or symlinks.
7260 In any of the path components is missing or unavailable,
7261 it exits with a non-zero exit code.
7266 readlink [@var{option}] @var{file}
7269 By default, @command{readlink} operates in readlink mode.
7271 The program accepts the following options. Also see @ref{Common options}.
7276 @itemx --canonicalize
7278 @opindex --canonicalize
7279 Activate canonicalize mode.
7284 @opindex --no-newline
7285 Do not output the trailing newline.
7295 Suppress most error messages.
7301 Report error messages.
7305 The @command{readlink} utility first appeared in OpenBSD 2.1.
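Both modes can be compared in a scratch directory. This sketch assumes the short option @option{-f} selects canonicalize mode, as in @sc{gnu} @command{readlink}; the file names are illustrative.

```shell
tmp=$(mktemp -d)
cd "$tmp"
mkdir sub
echo data > file
ln -s file link

# Readlink mode: print what the symbolic link contains.
value=$(readlink link)

# A non-symlink argument makes readlink exit with non-zero status.
status=0
readlink file > /dev/null || status=$?

# Canonicalize mode: resolve ".", "..", and symlinks to an absolute name.
canon=$(readlink -f ./sub/../link)
expected="$(pwd -P)/file"

cd /
rm -rf "$tmp"
echo "value=$value status=$status canon=$canon"
```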
7308 @node rmdir invocation
7309 @section @command{rmdir}: Remove empty directories
7312 @cindex removing empty directories
7313 @cindex directories, removing empty
7315 @command{rmdir} removes empty directories. Synopsis:
7318 rmdir [@var{option}]@dots{} @var{directory}@dots{}
7321 If any @var{directory} argument does not refer to an existing empty
7322 directory, it is an error.
7324 The program accepts the following option. Also see @ref{Common options}.
7328 @item --ignore-fail-on-non-empty
7329 @opindex --ignore-fail-on-non-empty
7330 @cindex directory deletion, ignoring failures
7331 Ignore each failure to remove a directory that is solely because
7332 the directory is non-empty.
7338 @cindex parent directories, removing
7339 Remove @var{directory}, then try to remove each component of @var{directory}.
7340 So, for example, @samp{rmdir -p a/b/c} is similar to @samp{rmdir a/b/c a/b a}.
7341 As such, it fails if any of those directories turns out not to be empty.
7342 Use the @option{--ignore-fail-on-non-empty} option to make it so such
7343 a failure does not evoke a diagnostic and does not cause @command{rmdir} to
7344 exit unsuccessfully.
7350 @cindex directory deletion, reporting
7351 Give a diagnostic for each @var{directory} that is successfully removed.
7356 @xref{rm invocation}, for how to remove non-empty directories (recursively).
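The following sketch shows the parent-removing option together with @option{--ignore-fail-on-non-empty}. It assumes @option{-p} is the short form of that parent-removing option, as in the @samp{rmdir -p a/b/c} example above; the directory names are made up.

```shell
tmp=$(mktemp -d)
cd "$tmp"

# -p removes a/b/c, then a/b, then a.
mkdir -p a/b/c
rmdir -p a/b/c
a_gone=yes
if [ -e a ]; then a_gone=no; fi

# Here x still holds a file once x/y is gone, so removing x fails;
# the extra option suppresses the diagnostic and the failing status.
mkdir -p x/y
touch x/keep
status=0
rmdir -p --ignore-fail-on-non-empty x/y || status=$?
x_kept=no
if [ -d x ] && [ ! -e x/y ]; then x_kept=yes; fi

cd /
rm -rf "$tmp"
echo "a_gone=$a_gone status=$status x_kept=$x_kept"
```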
7358 @node unlink invocation
7359 @section @command{unlink}: Remove files via the unlink syscall
7362 @cindex removing files or directories (via the unlink syscall)
7364 @command{unlink} deletes a single specified file name.
7365 It is a minimalist interface to the system-provided
7366 @code{unlink} function. @xref{Deleting Files, , , libc,
7367 The GNU C Library Reference Manual}. Synopsis:
7370 unlink @var{filename}
7373 On some systems @code{unlink} can be used to delete the name of a
7374 directory. On others, it can be used that way only by a privileged user.
7375 In the GNU system @code{unlink} can never delete the name of a directory.
7377 By default, @command{unlink} honors the @option{--help} and @option{--version}
7378 options. That makes it a little harder to remove files named
7379 @code{--help} and @code{--version}, so when the environment variable
7380 @env{POSIXLY_CORRECT} is set, @command{unlink} treats such a command-line
7381 argument not as an option, but as an operand.
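A small sketch of @command{unlink} on a regular file and on a directory follows. It assumes a system, such as GNU/Linux, where @code{unlink} refuses to remove a directory name; the file names are illustrative.

```shell
tmp=$(mktemp -d)
cd "$tmp"
touch plain
mkdir dir

# Exactly one operand is accepted; the named file is removed.
unlink plain
plain_gone=yes
if [ -e plain ]; then plain_gone=no; fi

# Attempting the same on a directory is expected to fail here.
dir_status=0
unlink dir 2>/dev/null || dir_status=$?

cd /
rm -rf "$tmp"
echo "plain_gone=$plain_gone dir_status=$dir_status"
```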
7384 @node Changing file attributes
7385 @chapter Changing file attributes
7387 @cindex changing file attributes
7388 @cindex file attributes, changing
7389 @cindex attributes, file
7391 A file is not merely its contents, a name, and a file type
7392 (@pxref{Special file types}). A file also has an owner (a userid), a
7393 group (a group id), permissions (what the owner can do with the file,
7394 what people in the group can do, and what everyone else can do), various
7395 timestamps, and other information. Collectively, we call these a file's @dfn{attributes}.
7398 These commands change file attributes.
7401 * chgrp invocation:: Change file groups.
7402 * chmod invocation:: Change access permissions.
7403 * chown invocation:: Change file owners and groups.
7404 * touch invocation:: Change file timestamps.
7408 @node chown invocation
7409 @section @command{chown}: Change file owner and group
7412 @cindex file ownership, changing
7413 @cindex group ownership, changing
7414 @cindex changing file ownership
7415 @cindex changing group ownership
7417 @command{chown} changes the user and/or group ownership of each given @var{file}
7418 to @var{new-owner} or to the user and group of an existing reference file.
7422 chown [@var{option}]@dots{} @{@var{new-owner} | --reference=@var{ref_file}@} @var{file}@dots{}
7425 If used, @var{new-owner} specifies the new owner and/or group as follows
7426 (with no embedded white space):
7429 [@var{owner}] [ [:] [@var{group}] ]
7436 If only an @var{owner} (a user name or numeric user id) is given, that
7437 user is made the owner of each given file, and the files' group is not changed.
7440 @itemx owner@samp{:}group
7441 If the @var{owner} is followed by a colon and a @var{group} (a
7442 group name or numeric group id), with no spaces between them, the group
7443 ownership of the files is changed as well (to @var{group}).
7445 @itemx owner@samp{:}
7446 If a colon but no group name follows @var{owner}, that user is
7447 made the owner of the files and the group of the files is changed to
7448 @var{owner}'s login group.
7450 @itemx @samp{:}group
7451 If the colon and following @var{group} are given, but the owner
7452 is omitted, only the group of the files is changed; in this case,
7453 @command{chown} performs the same function as @command{chgrp}.
7457 You may use @samp{.} in place of the @samp{:} separator. This is a
7458 @sc{gnu} extension for compatibility with older scripts.
7459 New scripts should avoid the use of @samp{.} because @sc{gnu} @command{chown}
7460 may fail if @var{owner} contains @samp{.} characters.
7462 The program accepts the following options. Also see @ref{Common options}.
7470 @cindex changed owners, verbosely describing
7471 Verbosely describe the action for each @var{file} whose ownership actually changes.
7480 @cindex error messages, omitting
7481 Do not print error messages about files whose ownership cannot be changed.
7484 @itemx @w{@kbd{--from}=@var{old-owner}}
7486 @cindex symbolic links, changing owner
7487 Change a @var{file}'s ownership only if it has current attributes specified
7488 by @var{old-owner}. @var{old-owner} has the same form as @var{new-owner} described above.
7490 This option is useful primarily from a security standpoint in that
7491 it narrows considerably the window of potential abuse.
7492 For example, to reflect a UID numbering change for one user's files
7493 without an option like this, @code{root} might run
7496 find / -owner OLDUSER -print0 | xargs -0 chown NEWUSER
7499 But that is dangerous because the interval between when the @code{find}
7500 tests the existing file's owner and when the @command{chown} is actually run may be quite large.
7502 One way to narrow the gap would be to invoke @command{chown} for each file as it is found:
7506 find / -owner OLDUSER -exec chown NEWUSER @{@} \;
7509 But that is very slow if there are many affected files.
7510 With this option, it is safer (the gap is narrower still)
7511 though still not perfect:
7514 chown -R --from=OLDUSER NEWUSER /
7518 @opindex --dereference
7519 @cindex symbolic links, changing owner
7521 Do not act on symbolic links themselves but rather on what they point to.
7524 @itemx --no-dereference
7526 @opindex --no-dereference
7527 @cindex symbolic links, changing owner
7529 Act on symbolic links themselves instead of what they point to.
7530 This is the default.
7531 This mode relies on the @code{lchown} system call.
7532 On systems that do not provide the @code{lchown} system call,
7533 @command{chown} fails when a file specified on the command line is a symbolic link.
7535 By default, no diagnostic is issued for symbolic links encountered
7536 during a recursive traversal, but see @option{--verbose}.
7538 @item --reference=@var{ref_file}
7539 @opindex --reference
7540 Change the user and group of each @var{file} to be the same as those of
7541 @var{ref_file}. If @var{ref_file} is a symbolic link, do not use the
7542 user and group of the symbolic link, but rather those of the file it refers to.
7549 Output a diagnostic for every file processed.
7550 If a symbolic link is encountered during a recursive traversal
7551 on a system without the @code{lchown} system call, and @option{--no-dereference}
7552 is in effect, then issue a diagnostic saying neither the symbolic link nor
7553 its referent is being changed.
7558 @opindex --recursive
7559 @cindex recursively changing file ownership
7560 Recursively change ownership of directories and their contents.
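The @var{new-owner} forms and the @option{--from} option can be exercised without privileges by naming the IDs a file already has. This is only a sketch: it assumes @sc{gnu} @command{stat} and @command{id}, and the variable names are made up.

```shell
tmp=$(mktemp -d)
f="$tmp/file"
touch "$f"
uid=$(id -u)
gid=$(id -g)

# owner:group form; the IDs are the invoking user's own, so no
# special privileges are needed.
chown "$uid:$gid" "$f"

# :group form changes only the group, like chgrp.
chown ":$gid" "$f"

# --from makes the change conditional on the current owner.
chown --from="$uid" "$uid" "$f"

owner_now=$(stat -c %u "$f")
group_now=$(stat -c %g "$f")
rm -rf "$tmp"
echo "owner=$owner_now group=$group_now"
```

Changing a file to a different owner generally requires appropriate privileges, which is why the sketch stays with the current user.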
7565 @node chgrp invocation
7566 @section @command{chgrp}: Change group ownership
7569 @cindex group ownership, changing
7570 @cindex changing group ownership
7572 @command{chgrp} changes the group ownership of each given @var{file}
7573 to @var{group} (which can be either a group name or a numeric group id)
7574 or to the group of an existing reference file. Synopsis:
7577 chgrp [@var{option}]@dots{} @{@var{group} | --reference=@var{ref_file}@} @var{file}@dots{}
7580 The program accepts the following options. Also see @ref{Common options}.
7588 @cindex changed files, verbosely describing
7589 Verbosely describe the action for each @var{file} whose group actually changes.
7598 @cindex error messages, omitting
7599 Do not print error messages about files whose group cannot be changed.
7603 @opindex --dereference
7604 @cindex symbolic links, changing group
7606 Do not act on symbolic links themselves but rather on what they point to.
7609 @itemx --no-dereference
7611 @opindex --no-dereference
7612 @cindex symbolic links, changing group
7614 Act on symbolic links themselves instead of what they point to.
7615 This is the default.
7616 This mode relies on the @code{lchown} system call.
7617 On systems that do not provide the @code{lchown} system call,
7618 @command{chgrp} fails when a file specified on the command line is a symbolic link.
7620 By default, no diagnostic is issued for symbolic links encountered
7621 during a recursive traversal, but see @option{--verbose}.
7623 @item --reference=@var{ref_file}
7624 @opindex --reference
7625 Change the group of each @var{file} to be the same as that of
7626 @var{ref_file}. If @var{ref_file} is a symbolic link, do not use the
7627 group of the symbolic link, but rather that of the file it refers to.
7633 Output a diagnostic for every file processed.
7634 If a symbolic link is encountered during a recursive traversal
7635 on a system without the @code{lchown} system call, and @option{--no-dereference}
7636 is in effect, then issue a diagnostic saying neither the symbolic link nor
7637 its referent is being changed.
7642 @opindex --recursive
7643 @cindex recursively changing group ownership
7644 Recursively change the group ownership of directories and their contents.
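A short sketch of the @option{--reference} form follows, again using only the invoking user's own group so that it runs unprivileged. @sc{gnu} @command{stat} is assumed, and the file names are illustrative.

```shell
tmp=$(mktemp -d)
touch "$tmp/ref" "$tmp/target"

# Give the reference file a known group, then copy it to the target.
chgrp "$(id -g)" "$tmp/ref"
chgrp --reference="$tmp/ref" "$tmp/target"

ref_group=$(stat -c %g "$tmp/ref")
target_group=$(stat -c %g "$tmp/target")
rm -rf "$tmp"
echo "ref=$ref_group target=$target_group"
```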
7649 @node chmod invocation
7650 @section @command{chmod}: Change access permissions
7653 @cindex changing access permissions
7654 @cindex access permissions, changing
7655 @cindex permissions, changing access
7657 @command{chmod} changes the access permissions of the named files. Synopsis:
7660 chmod [@var{option}]@dots{} @{@var{mode} | --reference=@var{ref_file}@} @var{file}@dots{}
7663 @cindex symbolic links, permissions of
7664 @command{chmod} never changes the permissions of symbolic links, since
the @code{chmod} system call cannot change their permissions.
7666 This is not a problem since the permissions of symbolic links are
7667 never used. However, for each symbolic link listed on the command
7668 line, @command{chmod} changes the permissions of the pointed-to file.
7669 In contrast, @command{chmod} ignores symbolic links encountered during
7670 recursive directory traversals.
7672 If used, @var{mode} specifies the new permissions.
7673 For details, see the section on @ref{File permissions}.
7675 The program accepts the following options. Also see @ref{Common options}.
7683 Verbosely describe the action for each @var{file} whose permissions
7692 @cindex error messages, omitting
Do not print error messages about files whose permissions cannot be
changed.
7700 Verbosely describe the action or non-action taken for every @var{file}.
7702 @item --reference=@var{ref_file}
7703 @opindex --reference
7704 Change the mode of each @var{file} to be the same as that of @var{ref_file}.
7705 @xref{File permissions}.
7706 If @var{ref_file} is a symbolic link, do not use the mode
7707 of the symbolic link, but rather that of the file it refers to.
7712 @opindex --recursive
7713 @cindex recursively changing access permissions
7714 Recursively change permissions of directories and their contents.
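
The following sketch illustrates @option{--reference} and the treatment of
a symbolic link named on the command line.  The file names are
hypothetical, and @sc{gnu} @command{stat} is assumed for displaying the
resulting modes.

```shell
dir=$(mktemp -d)
touch "$dir/ref" "$dir/target"
chmod 640 "$dir/ref"

# Copy the mode of ref onto target rather than spelling it out.
chmod --reference="$dir/ref" "$dir/target"
copied_mode=$(stat -c %a "$dir/target")

# A symbolic link named on the command line: the pointed-to file changes.
ln -s target "$dir/link"
chmod 600 "$dir/link"
final_mode=$(stat -c %a "$dir/target")

echo "copied=$copied_mode final=$final_mode"
rm -r "$dir"
```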
7719 @node touch invocation
7720 @section @command{touch}: Change file timestamps
7723 @cindex changing file timestamps
7724 @cindex file timestamps, changing
7725 @cindex timestamps, changing file
7727 @command{touch} changes the access and/or modification times of the
7728 specified files. Synopsis:
7731 touch [@var{option}]@dots{} @var{file}@dots{}
7734 On older systems, @command{touch} supports an obsolete syntax, as follows.
7735 If the first @var{file} would be a valid argument to the @option{-t}
7736 option and no timestamp is given with any of the @option{-d}, @option{-r},
7737 or @option{-t} options and the @samp{--} argument is not given, that
7738 argument is interpreted as the time for the other files instead of
7739 as a file name. @acronym{POSIX} 1003.1-2001 (@pxref{Standards conformance})
7740 does not allow this; use @option{-t} instead.
7742 @cindex empty files, creating
7743 Any @var{file} that does not exist is created empty.
7745 @cindex permissions, for changing file timestamps
7746 If changing both the access and modification times to the current
7747 time, @command{touch} can change the timestamps for files that the user
7748 running it does not own but has write permission for. Otherwise, the
7749 user must own the files.
7751 Although @command{touch} provides options for changing two of the times --
7752 the times of last access and modification -- of a file, there is actually
7753 a third one as well: the inode change time. This is often referred to
7754 as a file's @code{ctime}.
7755 The inode change time represents the time when the file's meta-information
7756 last changed. One common example of this is when the permissions of a
7757 file change. Changing the permissions doesn't access the file, so
7758 the atime doesn't change, nor does it modify the file, so the mtime
7759 doesn't change. Yet, something about the file itself has changed,
7760 and this must be noted somewhere. This is the job of the ctime field.
7761 This is necessary, so that, for example, a backup program can make a
7762 fresh copy of the file, including the new permissions value.
7763 Another operation that modifies a file's ctime without affecting
7764 the others is renaming. In any case, it is not possible, in normal
7765 operations, for a user to change the ctime field to a user-specified value.
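
The distinction between the three times can be observed with a short
experiment.  This is only a sketch: the file name is hypothetical, and it
assumes @sc{gnu} @command{stat}, whose @samp{%Y} and @samp{%Z} formats
print the modification and change times as seconds since the Epoch.

```shell
dir=$(mktemp -d)
f="$dir/file"

# Set the access and modification times to a date in the past.
touch -t 200001011200 "$f"
mtime_before=$(stat -c %Y "$f")

# Changing the permissions alters only the file's meta-information.
chmod 600 "$f"
mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")

echo "mtime before=$mtime_before after=$mtime_after ctime=$ctime_after"
rm -r "$dir"
```

The modification time should be unchanged by @command{chmod}, while the
change time reflects the moment the commands were run.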
7767 The program accepts the following options. Also see @ref{Common options}.
7773 @itemx --time=access
7777 @opindex atime@r{, changing}
7778 @opindex access @r{time, changing}
7779 @opindex use @r{time, changing}
7780 Change the access time only.
7785 @opindex --no-create
7786 Do not create files that do not exist.
7793 Use @var{time} instead of the current time. It can contain month names,
7794 time zones, @samp{am} and @samp{pm}, etc. @xref{Date input formats}.
7798 @cindex BSD @command{touch} compatibility
7799 Ignored; for compatibility with BSD versions of @command{touch}.
7803 @itemx --time=modify
7806 @opindex mtime@r{, changing}
7807 @opindex modify @r{time, changing}
7808 Change the modification time only.
7811 @itemx --reference=@var{file}
7813 @opindex --reference
7814 Use the times of the reference @var{file} instead of the current time.
7816 @item -t [[CC]YY]MMDDhhmm[.ss]
7817 Use the argument (optional four-digit or two-digit years, months,
7818 days, hours, minutes, optional seconds) instead of the current time.
7819 If the year is specified with only two digits, then @var{CC}
7820 is 20 for years in the range 0 @dots{} 68, and 19 for years in
7821 69 @dots{} 99. If no digits of the year are specified,
7822 the argument is interpreted as a date in the current year.
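
As an illustration of the @option{-t} syntax and of @option{-r}, the
sketch below sets timestamps on hypothetical files and prints them as
seconds since the Epoch.  It assumes @sc{gnu} @command{stat}, and it sets
@env{TZ} to UTC for the @command{touch} calls so that the results do not
depend on the local time zone.

```shell
dir=$(mktemp -d)

# Full form: CCYYMMDDhhmm.ss, interpreted here in UTC.
TZ=UTC0 touch -t 200001011200.00 "$dir/full"

# Two-digit year 99 lies in the range 69 to 99, so the century is 19.
TZ=UTC0 touch -t 9901011200 "$dir/short"

# Copy the times from an existing file instead of giving a date.
touch -r "$dir/full" "$dir/copy"

full=$(stat -c %Y "$dir/full")
short=$(stat -c %Y "$dir/short")
copy=$(stat -c %Y "$dir/copy")
echo "full=$full short=$short copy=$copy"
rm -r "$dir"
```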
7832 No disk can hold an infinite amount of data. These commands report on
7833 how much disk storage is in use or available. (This has nothing much to
7834 do with how much @emph{main memory}, i.e., RAM, a program is using when
it runs; for that, you want @command{ps} or @command{pstat} or @command{swap}
7836 or some such command.)
7839 * df invocation:: Report filesystem disk space usage.
7840 * du invocation:: Estimate file space usage.
7841 * stat invocation:: Report file or filesystem status.
7842 * sync invocation:: Synchronize memory and disk.
7847 @section @command{df}: Report filesystem disk space usage
7850 @cindex filesystem disk usage
7851 @cindex disk usage by filesystem
7853 @command{df} reports the amount of disk space used and available on
7854 filesystems. Synopsis:
7857 df [@var{option}]@dots{} [@var{file}]@dots{}
7860 With no arguments, @command{df} reports the space used and available on all
7861 currently mounted filesystems (of all types). Otherwise, @command{df}
7862 reports on the filesystem containing each argument @var{file}.
7864 Normally the disk space is printed in units of
7865 1024 bytes, but this can be overridden (@pxref{Block size}).
7866 Non-integer quantities are rounded up to the next higher unit.
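
The rounding rule amounts to a ceiling division.  The following sketch
shows only the arithmetic, using made-up values; it is not taken from the
@command{df} implementation.

```shell
bytes=1500
block=1024

# Integer ceiling division: add block-1 before dividing.
units=$(( (bytes + block - 1) / block ))
echo "$bytes bytes occupy $units units of $block bytes"
```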
7868 @cindex disk device file
7869 @cindex device file, disk
7870 If an argument @var{file} is a disk device file containing a mounted
7871 filesystem, @command{df} shows the space available on that filesystem
7872 rather than on the filesystem containing the device node (i.e., the root
7873 filesystem). @sc{gnu} @command{df} does not attempt to determine the disk usage
7874 on unmounted filesystems, because on most kinds of systems doing so
7875 requires extremely nonportable intimate knowledge of filesystem
7878 The program accepts the following options. Also see @ref{Common options}.
7886 @cindex automounter filesystems
7887 @cindex ignore filesystems
7888 Include in the listing filesystems that have a size of 0 blocks, which
7889 are omitted by default. Such filesystems are typically special-purpose
7890 pseudo-filesystems, such as automounter entries. Also, filesystems of
7891 type ``ignore'' or ``auto'', supported by some operating systems, are
7892 only included if this option is specified.
7895 @itemx --block-size=@var{size}
7897 @opindex --block-size
7898 @cindex filesystem sizes
7899 Scale sizes by @var{size} before printing them (@pxref{Block size}).
7900 For example, @option{-BG} prints sizes in units of 1,073,741,824 bytes.
7903 @itemx --human-readable
7905 @opindex --human-readable
7906 @cindex human-readable output
7907 Append a size letter to each size, such as @samp{M} for mebibytes.
7908 Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes.
7909 Use the @option{-H} or @option{--si} option if you prefer powers of 1000.
7916 Append an SI-style abbreviation to each size, such as @samp{MB} for
7917 megabytes. Powers of 1000 are used, not 1024; @samp{MB} stands for
7918 1,000,000 bytes. Use the @option{-h} or @option{--human-readable} option if
7919 you prefer powers of 1024.
7926 List inode usage information instead of block usage. An inode (short
7927 for index node) contains information about a file such as its owner,
7928 permissions, timestamps, and location on the disk.
7932 @cindex kibibytes for filesystem sizes
7933 Print sizes in 1024-byte blocks, overriding the default block size
7934 (@pxref{Block size}).
7935 This option is equivalent to @option{--block-size=1K}.
7941 @cindex filesystem types, limiting output to certain
7942 Limit the listing to local filesystems. By default, remote filesystems
7947 @cindex filesystem space, retrieving old data more quickly
7948 Do not invoke the @code{sync} system call before getting any usage data.
7949 This may make @command{df} run significantly faster on systems with many
7950 disks, but on some systems (notably SunOS) the results may be slightly
7951 out of date. This is the default.
7954 @itemx --portability
7956 @opindex --portability
7957 @cindex one-line output format
7958 @cindex @acronym{POSIX} output format
7959 @cindex portable output format
7960 @cindex output format, portable
7961 Use the @acronym{POSIX} output format. This is like the default format except
7966 The information about each filesystem is always printed on exactly
7967 one line; a mount device is never put on a line by itself. This means
7968 that if the mount device name is more than 20 characters long (e.g., for
7969 some network mounts), the columns are misaligned.
7972 The labels in the header output line are changed to conform to @acronym{POSIX}.
7977 @cindex filesystem space, retrieving current data more slowly
7978 Invoke the @code{sync} system call before getting any usage data. On
7979 some systems (notably SunOS), doing this yields more up to date results,
7980 but in general this option makes @command{df} much slower, especially when
7981 there are many or very busy filesystems.
7983 @item -t @var{fstype}
7984 @itemx --type=@var{fstype}
7987 @cindex filesystem types, limiting output to certain
7988 Limit the listing to filesystems of type @var{fstype}. Multiple
7989 filesystem types can be specified by giving multiple @option{-t} options.
7990 By default, nothing is omitted.
7995 @opindex --print-type
7996 @cindex filesystem types, printing
7997 Print each filesystem's type. The types printed here are the same ones
7998 you can include or exclude with @option{-t} and @option{-x}. The particular
7999 types printed are whatever is supported by the system. Here are some of
8000 the common names (this list is certainly not exhaustive):
8005 @cindex NFS filesystem type
8006 An NFS filesystem, i.e., one mounted over a network from another
8007 machine. This is the one type name which seems to be used uniformly by
8010 @item 4.2@r{, }ufs@r{, }efs@dots{}
8011 @cindex Linux filesystem types
8012 @cindex local filesystem types
8013 @opindex 4.2 @r{filesystem type}
8014 @opindex ufs @r{filesystem type}
8015 @opindex efs @r{filesystem type}
8016 A filesystem on a locally-mounted hard disk. (The system might even
8017 support more than one type here; Linux does.)
8019 @item hsfs@r{, }cdfs
8020 @cindex CD-ROM filesystem type
8021 @cindex High Sierra filesystem
8022 @opindex hsfs @r{filesystem type}
8023 @opindex cdfs @r{filesystem type}
A filesystem on a CD-ROM drive. HP-UX uses @samp{cdfs}; most other
systems use @samp{hsfs} (@samp{hs} for ``High Sierra'').
8028 @cindex PC filesystem
8029 @cindex DOS filesystem
8030 @cindex MS-DOS filesystem
8031 @cindex diskette filesystem
8033 An MS-DOS filesystem, usually on a diskette.
8037 @item -x @var{fstype}
8038 @itemx --exclude-type=@var{fstype}
8040 @opindex --exclude-type
8041 Limit the listing to filesystems not of type @var{fstype}.
8042 Multiple filesystem types can be eliminated by giving multiple
8043 @option{-x} options. By default, no filesystem types are omitted.
8046 Ignored; for compatibility with System V versions of @command{df}.
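
Because @option{-P} keeps each filesystem on a single line, its output is
convenient to split into fields in a script.  The sketch below is one
possible approach, not the only one; it assumes that the mount device
name contains no white space.

```shell
# With -P, line 1 is the header and line 2 describes the filesystem of ".".
row=$(df -P . | sed -n '2p')
line_count=$(df -P . | wc -l)

# Unquoted on purpose: let the shell split the row into fields.
set -- $row
echo "filesystem=$1 blocks=$2 used=$3 available=$4 capacity=$5"
```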
8052 @section @command{du}: Estimate file space usage
8055 @cindex file space usage
8056 @cindex disk usage for files
8058 @command{du} reports the amount of disk space used by the specified files
8059 and for each subdirectory (of directory arguments). Synopsis:
8062 du [@var{option}]@dots{} [@var{file}]@dots{}
8065 With no arguments, @command{du} reports the disk space for the current
8066 directory. Normally the disk space is printed in units of
8067 1024 bytes, but this can be overridden (@pxref{Block size}).
8068 Non-integer quantities are rounded up to the next higher unit.
8070 The program accepts the following options. Also see @ref{Common options}.
8078 Show counts for all files, not just directories.
8084 Print sizes in bytes, overriding the default block size (@pxref{Block size}).
8087 @itemx --block-size=@var{size}
8089 @opindex --block-size
8091 Scale sizes by @var{size} before printing them (@pxref{Block size}).
8092 For example, @option{-BG} prints sizes in units of 1,073,741,824 bytes.
8098 @cindex grand total of disk space
8099 Print a grand total of all arguments after all arguments have
8100 been processed. This can be used to find out the total disk usage of
8101 a given set of files or directories.
8104 @itemx --dereference-args
8106 @opindex --dereference-args
8107 Dereference symbolic links that are command line arguments.
8108 Does not affect other symbolic links. This is helpful for finding
8109 out the disk usage of directories, such as @file{/usr/tmp}, which
8110 are often symbolic links.
8113 @itemx --human-readable
8115 @opindex --human-readable
8116 @cindex human-readable output
8117 Append a size letter to each size, such as @samp{M} for mebibytes.
8118 Powers of 1024 are used, not 1000; @samp{M} stands for 1,048,576 bytes.
8119 Use the @option{-H} or @option{--si} option if you prefer powers of 1000.
8126 Append an SI-style abbreviation to each size, such as @samp{MB} for
8127 megabytes. Powers of 1000 are used, not 1024; @samp{MB} stands for
8128 1,000,000 bytes. Use the @option{-h} or @option{--human-readable} option if
8129 you prefer powers of 1024.
8133 @cindex kibibytes for file sizes
8134 Print sizes in 1024-byte blocks, overriding the default block size
8135 (@pxref{Block size}).
8136 This option is equivalent to @option{--block-size=1K}.
8139 @itemx --count-links
8141 @opindex --count-links
8142 @cindex hard links, counting in @command{du}
8143 Count the size of all files, even if they have appeared already (as a
8147 @itemx --dereference
8149 @opindex --dereference
8150 @cindex symbolic links, dereferencing in @command{du}
8151 Dereference symbolic links (show the disk space used by the file
8152 or directory that the link points to instead of the space used by
8155 @item --max-depth=@var{DEPTH}
8156 @opindex --max-depth=@var{DEPTH}
8157 @cindex limiting output of @command{du}
Show the total for each directory (and each file, if @option{--all} is
given) that is at most @var{DEPTH} levels down from the root of the
hierarchy. The root
8160 is at level 0, so @code{du --max-depth=0} is equivalent to @code{du -s}.
8165 @opindex --summarize
8166 Display only a total for each argument.
8169 @itemx --separate-dirs
8171 @opindex --separate-dirs
8172 Report the size of each directory separately, not including the sizes
8176 @itemx --one-file-system
8178 @opindex --one-file-system
8179 @cindex one filesystem, restricting @command{du} to
8180 Skip directories that are on different filesystems from the one that
8181 the argument being processed is on.
8183 @item --exclude=@var{PATTERN}
8184 @opindex --exclude=@var{PATTERN}
8185 @cindex excluding files from @command{du}
8186 When recursing, skip subdirectories or files matching @var{PATTERN}.
8187 For example, @code{du --exclude='*.o'} excludes files whose names
8191 @itemx --exclude-from=@var{FILE}
8192 @opindex -X @var{FILE}
8193 @opindex --exclude-from=@var{FILE}
8194 @cindex excluding files from @command{du}
8195 Like @option{--exclude}, except take the patterns to exclude from @var{FILE},
8196 one per line. If @var{FILE} is @samp{-}, take the patterns from standard
8201 @cindex NFS mounts from BSD to HP-UX
8202 On BSD systems, @command{du} reports sizes that are half the correct
8203 values for files that are NFS-mounted from HP-UX systems. On HP-UX
8204 systems, it reports sizes that are twice the correct values for
8205 files that are NFS-mounted from BSD systems. This is due to a flaw
8206 in HP-UX; it also affects the HP-UX @command{du} program.
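
The following sketch exercises a few of the options above on a small,
hypothetical directory tree.  It compares @option{-s} with
@option{--max-depth=0} and uses @option{--exclude} together with
@option{-a}; the sizes printed depend on the filesystem.

```shell
dir=$(mktemp -d)
mkdir -p "$dir/src/sub"
echo data > "$dir/src/a.c"
echo data > "$dir/src/a.o"
echo data > "$dir/src/sub/b.c"

# These two invocations are expected to print the same single line.
summary=$(du -s "$dir/src")
depth0=$(du --max-depth=0 "$dir/src")

# List every file, but skip names matching the pattern.
listing=$(du -a --exclude='*.o' "$dir/src")
echo "$listing"
rm -r "$dir"
```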
8209 @node stat invocation
8210 @section @command{stat}: Report file or filesystem status
8214 @cindex filesystem status
8216 @command{stat} displays information about the specified file(s). Synopsis:
8219 stat [@var{option}]@dots{} [@var{file}]@dots{}
With no options, @command{stat} reports all available information about the
given files. It can also report on the filesystems that contain the given
files. If a file is a symbolic link, @command{stat} can instead report on
the file the link points to.
8233 @opindex --filesystem
8235 Report information about the filesystems where the given files are located
8236 instead of information about the files themselves.
8239 @itemx --dereference
8241 @opindex --dereference
8242 @cindex symbolic links, dereferencing in @command{stat}
8243 Change how @command{stat} treats symbolic links.
8244 With this option, @command{stat} acts on the file referenced
8245 by each symbolic link argument.
8246 Without it, @command{stat} acts on any symbolic link argument directly.
8252 @cindex terse output
8253 Print the information in terse form, suitable for parsing by other programs.
8259 @cindex output format
Let the user specify the output format.
8262 Interpreted sequences for file stat are:
8264 @item %n - File name
@item %N - Quoted file name, dereferenced if it is a symbolic link
8266 @item %d - Device number in decimal
8267 @item %D - Device number in hex
8268 @item %i - Inode number
8269 @item %a - Access rights in octal
@item %A - Access rights in human-readable form
@item %f - Raw mode in hex
8272 @item %F - File type
8273 @item %h - Number of hard links
@item %u - User ID of owner
@item %U - User name of owner
@item %g - Group ID of owner
@item %G - Group name of owner
8278 @item %t - Major device type in hex
8279 @item %T - Minor device type in hex
8280 @item %s - Total size, in bytes
8281 @item %b - Number of blocks allocated
@item %o - I/O block size
8283 @item %x - Time of last access
8284 @item %X - Time of last access as seconds since Epoch
8285 @item %y - Time of last modification
8286 @item %Y - Time of last modification as seconds since Epoch
8287 @item %z - Time of last change
8288 @item %Z - Time of last change as seconds since Epoch
8291 Interpreted sequences for filesystem stat are:
8293 @item %n - File name
@item %i - File system ID in hex
8295 @item %l - Maximum length of filenames
8296 @item %t - Type in hex
@item %T - Type in human-readable form
8298 @item %b - Total data blocks in file system
8299 @item %f - Free blocks in file system
8300 @item %a - Free blocks available to non-superuser
8301 @item %s - Optimal transfer block size
8302 @item %c - Total file nodes in file system
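
As a sketch of how these sequences combine, the commands below print
selected fields for a hypothetical file and for the filesystem that
contains it.  The example assumes that the format is passed with
@option{-c}, as in @sc{gnu} @command{stat}, and it runs in the C locale so
that the file type is printed in English.

```shell
dir=$(mktemp -d)
printf 'hello' > "$dir/greeting"
chmod 644 "$dir/greeting"

# File mode: size in bytes, octal access rights, file type.
info=$(LC_ALL=C stat -c '%s %a %F' "$dir/greeting")
echo "$info"

# Filesystem mode: the same letters take their filesystem meanings.
fsinfo=$(stat -f -c '%l' "$dir")
echo "maximum file name length: $fsinfo"
rm -r "$dir"
```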
8307 @node sync invocation
8308 @section @command{sync}: Synchronize data on disk with memory
8311 @cindex synchronize disk and memory
8313 @cindex superblock, writing
8314 @cindex inodes, written buffered
8315 @command{sync} writes any data buffered in memory out to disk. This can
8316 include (but is not limited to) modified superblocks, modified inodes,
and delayed reads and writes. This must be implemented by the kernel;
the @command{sync} program does nothing but exercise the @code{sync} system
call.
8321 @cindex crashes and corruption
8322 The kernel keeps data in memory to avoid doing (relatively slow) disk
8323 reads and writes. This improves performance, but if the computer
8324 crashes, data may be lost or the filesystem corrupted as a
8325 result. @command{sync} ensures everything in memory is written to disk.
8327 Any arguments are ignored, except for a lone @option{--help} or
8328 @option{--version} (@pxref{Common options}).
8331 @chapter Printing text
8333 @cindex printing text, commands for
8334 @cindex commands for printing text
This chapter describes commands that display text strings.
8339 * echo invocation:: Print a line of text.
8340 * printf invocation:: Format and print data.
8341 * yes invocation:: Print a string until interrupted.
8345 @node echo invocation
8346 @section @command{echo}: Print a line of text
8349 @cindex displaying text
8350 @cindex printing text
8351 @cindex text, displaying
8352 @cindex arbitrary text, displaying
8354 @command{echo} writes each given @var{string} to standard output, with a
8355 space between each and a newline after the last one. Synopsis:
8358 echo [@var{option}]@dots{} [@var{string}]@dots{}
8361 The program accepts the following options. Also see @ref{Common options}.
8366 Do not output the trailing newline.
8370 @cindex backslash escapes
8371 Enable interpretation of the following backslash-escaped characters in
8380 suppress trailing newline
8394 the character whose @acronym{ASCII} code is @var{nnn} (octal); if @var{nnn} is not
8395 a valid octal number, it is printed literally.
8401 @node printf invocation
8402 @section @command{printf}: Format and print data
8405 @command{printf} does formatted printing of text. Synopsis:
8408 printf @var{format} [@var{argument}]@dots{}
8411 @command{printf} prints the @var{format} string, interpreting @samp{%}
directives and @samp{\} escapes in the same way as the C @code{printf}
8413 function. The @var{format} argument is re-used as necessary to convert
8414 all of the given @var{argument}s.
8416 @command{printf} has one additional directive, @samp{%b}, which prints its
8417 argument string with @samp{\} escapes interpreted in the same way as in
8418 the @var{format} string.
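
The re-use of @var{format} and the @samp{%b} directive can be seen in the
following sketch.  Note that an unadorned @samp{printf} in a script is
likely to run the shell's built-in command, which is assumed here to
behave the same way for these simple cases.

```shell
# Two arguments are consumed per pass, so the format is applied twice.
pairs=$(printf '%s=%d\n' a 1 b 2)
printf '%s\n' "$pairs"

# %b interprets backslash escapes in its argument; %s prints it as is.
tabbed=$(printf '%b' 'x\ty')
literal=$(printf '%s' 'x\ty')
printf '%s\n' "$tabbed" "$literal"
```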
8422 @command{printf} interprets @samp{\0ooo} in @var{format} as an octal number
8423 (if @var{ooo} is 0 to 3 octal digits) specifying a character to print,
8424 and @samp{\xhh} as a hexadecimal number (if @var{hh} is 1 to 2 hex
8425 digits) specifying a character to print.
8429 @command{printf} interprets two character syntaxes introduced in ISO C 99:
8430 @samp{\u} for 16-bit Unicode characters, specified as 4 hex digits
8431 @var{hhhh}, and @samp{\U} for 32-bit Unicode characters, specified as 8 hex
8432 digits @var{hhhhhhhh}. @command{printf} outputs the Unicode characters
according to the @code{LC_CTYPE} part of the current locale, i.e., depending
8434 on the values of the environment variables @code{LC_ALL}, @code{LC_CTYPE},
8437 The processing of @samp{\u} and @samp{\U} requires a full-featured
8438 @code{iconv} facility. It is activated on systems with glibc 2.2 (or newer),
8439 or when @code{libiconv} is installed prior to this package. Otherwise the
8440 use of @samp{\u} and @samp{\U} will give an error message.
8443 An additional escape, @samp{\c}, causes @command{printf} to produce no
8446 The only options are a lone @option{--help} or
8447 @option{--version}. @xref{Common options}.
8449 The Unicode character syntaxes are useful for writing strings in a locale
8450 independent way. For example, a string containing the Euro currency symbol
8453 $ /usr/local/bin/printf '\u20AC 14.95'
8457 will be output correctly in all locales supporting the Euro symbol
8458 (ISO-8859-15, UTF-8, and others). Similarly, a Chinese string
8461 $ /usr/local/bin/printf '\u4e2d\u6587'
will be output correctly in all Chinese locales (GB2312, BIG5, UTF-8, etc.).
8467 Note that in these examples, the full pathname of @command{printf} has been
given, to distinguish it from the @sc{gnu} @command{bash} built-in function
8471 For larger strings, you don't need to look up the hexadecimal code
values of each character one by one. A mix of @acronym{ASCII} characters and
@samp{\u} escape sequences is also known as the JAVA source file encoding. You can
use @sc{gnu} recode 3.5c (or newer) to convert strings to this encoding. Here
8475 is how to convert a piece of text into a shell script which will output
8476 this text in a locale-independent way:
8479 $ LC_CTYPE=zh_CN.big5 /usr/local/bin/printf \
8480 '\u4e2d\u6587\n' > sample.txt
8481 $ recode BIG5..JAVA < sample.txt \
8482 | sed -e "s|^|/usr/local/bin/printf '|" -e "s|$|\\\\n'|" \
8487 @node yes invocation
8488 @section @command{yes}: Print a string until interrupted
8491 @cindex repeated output of a string
8493 @command{yes} prints the command line arguments, separated by spaces and
8494 followed by a newline, forever until it is killed. If no arguments are
8495 given, it prints @samp{y} followed by a newline forever until killed.
8497 The only options are a lone @option{--help} or @option{--version}.
8498 @xref{Common options}.
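
Since @command{yes} never stops by itself, it is normally used at the head
of a pipeline whose reader decides when to finish.  A minimal sketch,
using @command{head} as a stand-in for such a reader:

```shell
# head exits after reading the lines it wants, which ends the pipeline.
answers=$(yes | head -n 3)
custom=$(yes no thanks | head -n 2)
printf '%s\n' "$answers" "$custom"
```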
8505 @cindex commands for exit status
8506 @cindex exit status commands
8508 This section describes commands that are primarily useful for their exit
8509 status, rather than their output. Thus, they are often used as the
8510 condition of shell @code{if} statements, or as the last command in a
8514 * false invocation:: Do nothing, unsuccessfully.
8515 * true invocation:: Do nothing, successfully.
8516 * test invocation:: Check file types and compare values.
8517 * expr invocation:: Evaluate expressions.
8521 @node false invocation
8522 @section @command{false}: Do nothing, unsuccessfully
8525 @cindex do nothing, unsuccessfully
8526 @cindex failure exit status
8527 @cindex exit status of @command{false}
8529 @command{false} does nothing except return an exit status of 1, meaning
@dfn{failure}. It can be used as a placeholder in shell scripts
8531 where an unsuccessful command is needed.
8533 By default, @command{false} honors the @option{--help} and @option{--version}
8534 options. However, that is contrary to @acronym{POSIX}, so when the environment
8535 variable @env{POSIXLY_CORRECT} is set, @command{false} ignores @emph{all}
8536 command line arguments, including @option{--help} and @option{--version}.
8538 This version of @command{false} is implemented as a C program, and is thus
8539 more secure and faster than a shell script implementation, and may safely
8540 be used as a dummy shell for the purpose of disabling accounts.
8543 @node true invocation
8544 @section @command{true}: Do nothing, successfully
8547 @cindex do nothing, successfully
8549 @cindex successful exit
8550 @cindex exit status of @command{true}
8552 @command{true} does nothing except return an exit status of 0, meaning
@dfn{success}. It can be used as a placeholder in shell scripts
8554 where a successful command is needed, although the shell built-in
8555 command @code{:} (colon) may do the same thing faster.
8556 In most modern shells, @command{true} is a built-in command, so when
8557 you use @samp{true} in a script, you're probably using the built-in
8558 command, not the one documented here.
8560 By default, @command{true} honors the @option{--help} and @option{--version}
8561 options. However, that is contrary to @acronym{POSIX}, so when the environment
8562 variable @env{POSIXLY_CORRECT} is set, @command{true} ignores @emph{all}
8563 command line arguments, including @option{--help} and @option{--version}.
8565 This version of @command{true} is implemented as a C program, and is thus
8566 more secure and faster than a shell script implementation, and may safely
8567 be used as a dummy shell for the purpose of disabling accounts.
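
The following sketch shows the exit statuses of @command{true} and
@command{false} and a typical use of @command{true} as a loop condition.
As noted above, a script like this most likely runs the shell's built-in
versions; the variable names are arbitrary.

```shell
# Record each exit status without letting a failure stop the script.
true_status=0
true || true_status=$?
false_status=0
false || false_status=$?
echo "true=$true_status false=$false_status"

# An endless loop whose body decides when to leave.
count=0
while true; do
  count=$((count + 1))
  if [ "$count" -ge 3 ]; then break; fi
done
echo "iterations=$count"
```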
8569 @node test invocation
8570 @section @command{test}: Check file types and compare values
8573 @cindex check file types
8574 @cindex compare values
8575 @cindex expression evaluation
8577 @command{test} returns a status of 0 (true) or 1 (false) depending on the
8578 evaluation of the conditional expression @var{expr}. Each part of the
8579 expression must be a separate argument.
8581 @command{test} has file status checks, string operators, and numeric
8582 comparison operators.
8584 @cindex conflicts with shell built-ins
8585 @cindex built-in shell commands, conflicts with
8586 Because most shells have a built-in command by the same name, using the
8587 unadorned command name in a script or interactively may get you
8588 different functionality than that described here.
8590 Besides the options below, @command{test} accepts a lone @option{--help} or
8591 @option{--version}. @xref{Common options}. A single non-option argument
8592 is also allowed: @command{test} returns true if the argument is not null.
8595 * File type tests:: -[bcdfhLpSt]
8596 * Access permission tests:: -[gkruwxOG]
8597 * File characteristic tests:: -e -s -nt -ot -ef
8598 * String tests:: -z -n = !=
8599 * Numeric tests:: -eq -ne -lt -le -gt -ge
8600 * Connectives for test:: ! -a -o
8604 @node File type tests
8605 @subsection File type tests
8607 @cindex file type tests
8609 These options test for particular types of files. (Everything's a file,
8610 but not all files are the same!)
8616 @cindex block special check
8617 True if @var{file} exists and is a block special device.
8621 @cindex character special check
8622 True if @var{file} exists and is a character special device.
8626 @cindex directory check
8627 True if @var{file} exists and is a directory.
8631 @cindex regular file check
8632 True if @var{file} exists and is a regular file.
8635 @itemx -L @var{file}
8638 @cindex symbolic link check
8639 True if @var{file} exists and is a symbolic link.
8643 @cindex named pipe check
8644 True if @var{file} exists and is a named pipe.
8648 @cindex socket check
8649 True if @var{file} exists and is a socket.
8653 @cindex terminal check
8654 True if @var{fd} is opened on a terminal. If @var{fd} is omitted, it
8655 defaults to 1 (standard output).
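
A short sketch of the file type tests, using hypothetical files in a
scratch directory.  The comment about symbolic links relies on the fact
that every test except the symbolic link check follows links.

```shell
dir=$(mktemp -d)
touch "$dir/plain"
mkdir "$dir/subdir"
ln -s plain "$dir/link"

kinds=""
test -f "$dir/plain"  && kinds="$kinds regular"
test -d "$dir/subdir" && kinds="$kinds directory"
test -L "$dir/link"   && kinds="$kinds symlink"
# -f follows the link, so a link to a regular file also passes -f.
test -f "$dir/link"   && kinds="$kinds link-target-regular"
test -d "$dir/plain"  || kinds="$kinds plain-not-directory"
echo "$kinds"
rm -r "$dir"
```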
8660 @node Access permission tests
8661 @subsection Access permission tests
8663 @cindex access permission tests
8664 @cindex permission tests
8666 These options test for particular access permissions.
8672 @cindex set-group-id check
8673 True if @var{file} exists and has its set-group-id bit set.
8677 @cindex sticky bit check
True if @var{file} exists and has its @dfn{sticky} bit set.
8682 @cindex readable file check
8683 True if @var{file} exists and is readable.
8687 @cindex set-user-id check
8688 True if @var{file} exists and has its set-user-id bit set.
8692 @cindex writable file check
8693 True if @var{file} exists and is writable.
8697 @cindex executable file check
8698 True if @var{file} exists and is executable.
8702 @cindex owned by effective uid check
8703 True if @var{file} exists and is owned by the current effective user id.
8707 @cindex owned by effective gid check
8708 True if @var{file} exists and is owned by the current effective group id.
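
The permission tests can be tried out as follows.  This is only a sketch
on a hypothetical file; it checks the executable, set-user-id, and
ownership tests, and it assumes the scratch filesystem stores the
set-user-id bit.

```shell
dir=$(mktemp -d)
f="$dir/script"
echo 'echo hi' > "$f"

# With every execute bit cleared, -x is false.
chmod a-x "$f"
if test -x "$f"; then exec_before=yes; else exec_before=no; fi

# Setting the set-user-id bit makes -u true.
if test -u "$f"; then suid_before=yes; else suid_before=no; fi
chmod u+s "$f"
if test -u "$f"; then suid_after=yes; else suid_after=no; fi

# -O compares the file's owner with the effective user id.
if test -O "$f"; then owned=yes; else owned=no; fi
echo "exec=$exec_before suid=$suid_before/$suid_after owned=$owned"
rm -r "$dir"
```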
8712 @node File characteristic tests
8713 @subsection File characteristic tests
8715 @cindex file characteristic tests
8717 These options test other file characteristics.
8723 @cindex existence-of-file check
8724 True if @var{file} exists.
8728 @cindex nonempty file check
8729 True if @var{file} exists and has a size greater than zero.
8731 @item @var{file1} -nt @var{file2}
8733 @cindex newer-than file check
8734 True if @var{file1} is newer (according to modification date) than
8735 @var{file2}, or if @var{file1} exists and @var{file2} does not.
8737 @item @var{file1} -ot @var{file2}
8739 @cindex older-than file check
8740 True if @var{file1} is older (according to modification date) than
8741 @var{file2}, or if @var{file2} exists and @var{file1} does not.
8743 @item @var{file1} -ef @var{file2}
8745 @cindex same file check
8746 @cindex hard link check
8747 True if @var{file1} and @var{file2} have the same device and inode
8748 numbers, i.e., if they are hard links to each other.
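The file characteristic tests can be tried out with a few scratch files. This is a minimal sketch: the directory, the file names, and the timestamps passed to @command{touch} are all invented for the example.

```shell
d=$(mktemp -d)

# An empty file exists (-e) but has no contents, so -s fails.
: > "$d/empty"
test -e "$d/empty" && exists=yes
test -s "$d/empty" || has_data=no

# A hard link shares the device and inode numbers of its target.
echo data > "$d/orig"
ln "$d/orig" "$d/link"
test "$d/orig" -ef "$d/link" && same=yes

# Give two files modification dates one year apart, then compare them.
touch -t 200001010000 "$d/old"
touch -t 200101010000 "$d/new"
test "$d/new" -nt "$d/old" && newer=yes
test "$d/old" -ot "$d/new" && older=yes

rm -rf "$d"
```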
8754 @subsection String tests
8756 @cindex string tests
8758 These options test string characteristics. Strings are not quoted for
8759 @command{test}, though you may need to quote them to protect characters
8760 with special meaning to the shell, e.g., spaces.
8764 @item -z @var{string}
8766 @cindex zero-length string check
8767 True if the length of @var{string} is zero.
8769 @item -n @var{string}
8772 @cindex nonzero-length string check
8773 True if the length of @var{string} is nonzero.
8775 @item @var{string1} = @var{string2}
8777 @cindex equal string check
8778 True if the strings are equal.
8780 @item @var{string1} != @var{string2}
8782 @cindex not-equal string check
8783 True if the strings are not equal.
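A short sketch of the string tests, with the operands quoted for the shell's benefit. The variable names and values are made up for the illustration.

```shell
empty=""
name="two words"

test -z "$empty" && is_empty=yes
test -n "$name" && is_set=yes

# Quoting keeps the embedded space from splitting the operand in two.
test "$name" = "two words" && equal=yes
test "$name" != "twowords" && differ=yes
```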
8789 @subsection Numeric tests
8791 @cindex numeric tests
8792 @cindex arithmetic tests
8794 Numeric relationals. The arguments must be entirely numeric (possibly
8795 negative), or the special expression @w{@code{-l @var{string}}}, which
8796 evaluates to the length of @var{string}.
8800 @item @var{arg1} -eq @var{arg2}
8801 @itemx @var{arg1} -ne @var{arg2}
8802 @itemx @var{arg1} -lt @var{arg2}
8803 @itemx @var{arg1} -le @var{arg2}
8804 @itemx @var{arg1} -gt @var{arg2}
8805 @itemx @var{arg1} -ge @var{arg2}
8812 These arithmetic binary operators return true if @var{arg1} is equal to,
8813 not equal to, less than, less than or equal to, greater than, or
8814 greater than or equal to @var{arg2}, respectively.
8821 test -1 -gt -2 && echo yes
8823 test -l abc -gt 1 && echo yes
8826 @error{} test: integer expression expected before -eq
8830 @node Connectives for test
8831 @subsection Connectives for @command{test}
8833 @cindex logical connectives
8834 @cindex connectives, logical
8836 The usual logical connectives.
8842 True if @var{expr} is false.
8844 @item @var{expr1} -a @var{expr2}
8846 @cindex logical and operator
8847 @cindex and operator
8848 True if both @var{expr1} and @var{expr2} are true.
8850 @item @var{expr1} -o @var{expr2}
8852 @cindex logical or operator
8854 True if either @var{expr1} or @var{expr2} is true.
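The connectives combine the simpler tests. The following sketch applies each of them to one number; the variable @code{n} and its value are arbitrary choices for the example.

```shell
n=5

# Negation: 5 is not equal to 0.
test ! "$n" -eq 0 && nonzero=yes

# Conjunction: 5 is greater than 1 and less than 10.
test "$n" -gt 1 -a "$n" -lt 10 && in_range=yes

# Disjunction: the first test is false, the second is true.
test "$n" -lt 0 -o "$n" -ge 5 && either=yes
```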
8859 @node expr invocation
8860 @section @command{expr}: Evaluate expressions
8863 @cindex expression evaluation
8864 @cindex evaluation of expressions
8866 @command{expr} evaluates an expression and writes the result on standard
8867 output. Each token of the expression must be a separate argument.
8869 Operands are either numbers or strings. @command{expr} converts
8870 anything appearing in an operand position to an integer or a string
8871 depending on the operation being applied to it.
8873 Strings are not quoted for @command{expr} itself, though you may need to
8874 quote them to protect characters with special meaning to the shell, e.g., spaces.
8877 @cindex parentheses for grouping
8878 Operators may be given as infix symbols or prefix keywords. Parentheses
8879 may be used for grouping in the usual manner (you must quote parentheses
8880 to avoid the shell evaluating them, however).
8882 @cindex exit status of @command{expr}
8886 0 if the expression is neither null nor 0,
8887 1 if the expression is null or 0,
8888 2 for invalid expressions.
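The three exit statuses can be observed directly. In this sketch the variable names are illustrative, and the output of @command{expr} is discarded because only the status is of interest.

```shell
# The result is 5, which is neither null nor 0, so the status is 0.
expr 2 + 3 > /dev/null && ok_status=$?

# The result is 0, so the status is 1.
expr 2 - 2 > /dev/null || zero_status=$?

# The right operand of + is missing, so the expression is invalid.
expr 2 + > /dev/null 2>&1 || bad_status=$?
```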
8892 * String expressions:: + : match substr index length
8893 * Numeric expressions:: + - * / %
8894 * Relations for expr:: | & < <= = == != >= >
8895 * Examples of expr:: Examples.
8899 @node String expressions
8900 @subsection String expressions
8902 @cindex string expressions
8903 @cindex expressions, string
8905 @command{expr} supports pattern matching and other string operators. These
8906 have higher precedence than both the numeric and relational operators (in the next sections).
8911 @item @var{string} : @var{regex}
8912 @cindex pattern matching
8913 @cindex regular expression matching
8914 @cindex matching patterns
8915 Perform pattern matching. The arguments are converted to strings and the
8916 second is considered to be a (basic, a la GNU @code{grep}) regular
8917 expression, with a @code{^} implicitly prepended. The first argument is
8918 then matched against this regular expression.
8920 If the match succeeds and @var{regex} uses @samp{\(} and @samp{\)}, the
8921 @code{:} expression returns the part of @var{string} that matched the
8922 subexpression; otherwise, it returns the number of characters matched.
8924 If the match fails, the @code{:} operator returns the null string if
8925 @samp{\(} and @samp{\)} are used in @var{regex}, otherwise 0.
8927 @kindex \( @r{regexp operator}
8928 Only the first @samp{\( @dots{} \)} pair is relevant to the return
8929 value; additional pairs are meaningful only for grouping the regular
8930 expression operators.
8932 @kindex \+ @r{regexp operator}
8933 @kindex \? @r{regexp operator}
8934 @kindex \| @r{regexp operator}
8935 In the regular expression, @code{\+}, @code{\?}, and @code{\|} are
8936 operators which respectively match one or more, zero or one, or separate
8937 alternatives. SunOS and other @command{expr}'s treat these as regular
8938 characters. (@acronym{POSIX} allows either behavior.)
8939 @xref{Top, , Regular Expression Library, regex, Regex}, for details of
8940 regular expression syntax. Some examples are in @ref{Examples of expr}.
8942 @item match @var{string} @var{regex}
8944 An alternative way to do pattern matching. This is the same as
8945 @w{@samp{@var{string} : @var{regex}}}.
8947 @item substr @var{string} @var{position} @var{length}
8949 Returns the substring of @var{string} beginning at @var{position}
8950 with length at most @var{length}. If either @var{position} or
8951 @var{length} is negative, zero, or non-numeric, returns the null string.
8953 @item index @var{string} @var{charset}
8955 Returns the first position in @var{string} where any character in
8956 @var{charset} was found. If no character in @var{charset} is found in
8957 @var{string}, returns 0.
8959 @item length @var{string}
8961 Returns the length of @var{string}.
8965 Interpret @var{token} as a string, even if it is a keyword like @code{match}
8966 or an operator like @code{/}.
8967 This makes it possible to test @code{expr length + "$x"} or
8968 @code{expr + "$x" : '.*/\(.\)'} and have it do the right thing even if
8969 the value of @code{$x} happens to be (for example) @code{/} or @code{index}.
8970 This operator is a GNU extension. Portable shell scripts should use
8971 @code{@w{" $token"} : @w{' \(.*\)'}} instead of @code{+ "$token"}.
8975 To make @command{expr} interpret keywords as strings, you must use the
8976 @code{quote} operator.
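The string operators described in this section can be sketched as follows. The operands are arbitrary, the variable names are illustrative, and the sketch assumes an @command{expr} that provides the @code{substr}, @code{index}, and @code{length} keywords as described here.

```shell
# A parenthesized group makes : return the text matched by the group.
middle=$(expr abc : 'a\(.\)c')

# Without a group, : returns the number of characters matched.
count=$(expr aaab : 'a*')

# The pattern is anchored at the start, so this match fails; the result
# is 0 and the exit status is 1, hence the "|| true".
miss=$(expr abc : 'b') || true

piece=$(expr substr abcdef 2 3)
where=$(expr index abcdef cz)
len=$(expr length abcdef)
```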
8979 @node Numeric expressions
8980 @subsection Numeric expressions
8982 @cindex numeric expressions
8983 @cindex expressions, numeric
8985 @command{expr} supports the usual numeric operators, in order of increasing
8986 precedence. The string operators (previous section) have higher precedence,
8987 the connectives (next section) have lower.
8996 Addition and subtraction. Both arguments are converted to numbers;
8997 an error occurs if this cannot be done.
9003 @cindex multiplication
9006 Multiplication, division, remainder. Both arguments are converted to
9007 numbers; an error occurs if this cannot be done.
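A small sketch of how the numeric operators group. The numbers and variable names are arbitrary; the @code{*} operator and the parentheses are escaped to protect them from the shell.

```shell
# Multiplication binds more tightly than addition: 1 + (2 * 3).
a=$(expr 1 + 2 \* 3)

# Quoted parentheses override that grouping: (1 + 2) * 3.
b=$(expr \( 1 + 2 \) \* 3)

# Division yields an integer quotient; % yields the remainder.
q=$(expr 7 / 2)
r=$(expr 7 % 2)
```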
9012 @node Relations for expr
9013 @subsection Relations for @command{expr}
9015 @cindex connectives, logical
9016 @cindex logical connectives
9017 @cindex relations, numeric or string
9019 @command{expr} supports the usual logical connectives and relations. These
9020 have lower precedence than either the string or numeric operators
9021 (previous sections). Here is the list, lowest-precedence operator first.
9027 @cindex logical or operator
9029 Returns its first argument if that is neither null nor 0, otherwise its second argument.
9034 @cindex logical and operator
9035 @cindex and operator
9036 Returns its first argument if neither argument is null or 0, otherwise 0.
9039 @item < <= = == != >= >
9046 @cindex comparison operators
9048 Compare the arguments and return 1 if the relation is true, 0 otherwise.
9049 @code{==} is a synonym for @code{=}. @command{expr} first tries to convert
9050 both arguments to numbers and do a numeric comparison; if either
9051 conversion fails, it does a lexicographic comparison using the character
9052 collating sequence specified by the @env{LC_COLLATE} locale.
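The following sketch contrasts numeric and string comparison and shows the two connectives. The operands and variable names are invented for the example, and the string comparison is run in the C locale so that the collating order is predictable.

```shell
# Both operands are numbers, so this is the numeric comparison 2 < 10.
numeric=$(expr 2 \< 10)

# "2a" is not a number, so the operands are compared as strings; in the
# C locale "2a" sorts after "10", so the relation is false.  A result of
# 0 gives exit status 1, hence the "|| true".
textual=$(LC_ALL=C expr 2a \< 10) || true

# | yields its first operand unless that operand is null or 0.
fallback=$(expr "" \| default)

# & yields 0 unless both operands are non-null and nonzero.
both=$(expr 0 \& 5) || true
```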
9057 @node Examples of expr
9058 @subsection Examples of using @command{expr}
9060 @cindex examples of @command{expr}
9061 Here are a few examples, including quoting for shell metacharacters.
9063 To add 1 to the shell variable @code{foo}, in Bourne-compatible shells:
9068 To print the non-directory part of the file name stored in
9069 @code{$fname}, which need not contain a @code{/}:
9071 expr $fname : '.*/\(.*\)' '|' $fname
9074 An example showing that @code{\+} is an operator:
9081 expr abc : 'a\(.\)c'
9083 expr index abcdef cz
9086 @error{} expr: syntax error
9087 expr index quote index a
9093 @chapter Redirection
9096 @cindex commands for redirection
9098 Unix shells commonly provide several forms of @dfn{redirection}---ways
9099 to change the input source or output destination of a command. But one
9100 useful redirection is performed by a separate command, not by the shell;
9101 it's described here.
9104 * tee invocation:: Redirect output to multiple files.
9108 @node tee invocation
9109 @section @command{tee}: Redirect output to multiple files
9112 @cindex pipe fitting
9113 @cindex destinations, multiple output
9114 @cindex read from stdin and write to stdout and files
9116 The @command{tee} command copies standard input to standard output and also
9117 to any files given as arguments. This is useful when you want not only
9118 to send some data down a pipe, but also to save a copy. Synopsis:
9121 tee [@var{option}]@dots{} [@var{file}]@dots{}
9124 If a file being written to does not already exist, it is created. If a
9125 file being written to already exists, the data it previously contained
9126 is overwritten unless the @code{-a} option is used.
9128 The program accepts the following options. Also see @ref{Common options}.
9135 Append standard input to the given files rather than overwriting them.
9139 @itemx --ignore-interrupts
9141 @opindex --ignore-interrupts
9142 Ignore interrupt signals.
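A minimal sketch of @command{tee} in a pipeline, first overwriting a file and then appending to it. The scratch directory and the @file{log} file name are assumptions made for the example.

```shell
d=$(mktemp -d)

# Send two lines down the pipe and keep a copy of them in the log.
lines=$(printf 'one\ntwo\n' | tee "$d/log" | wc -l)

# With -a the existing contents of the log are preserved.
printf 'three\n' | tee -a "$d/log" > /dev/null
saved=$(cat "$d/log")

rm -rf "$d"
```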
9147 @node File name manipulation
9148 @chapter File name manipulation
9150 @cindex file name manipulation
9151 @cindex manipulation of file names
9152 @cindex commands for file name manipulation
9154 This chapter describes commands that manipulate file names.
9157 * basename invocation:: Strip directory and suffix from a file name.
9158 * dirname invocation:: Strip non-directory suffix from a file name.
9159 * pathchk invocation:: Check file name portability.
9163 @node basename invocation
9164 @section @command{basename}: Strip directory and suffix from a file name
9167 @cindex strip directory and suffix from file names
9168 @cindex directory, stripping from file names
9169 @cindex suffix, stripping from file names
9170 @cindex file names, stripping directory and suffix
9171 @cindex leading directory components, stripping
9173 @command{basename} removes any leading directory components from
9174 @var{name}. Synopsis:
9177 basename @var{name} [@var{suffix}]
9180 If @var{suffix} is specified and is identical to the end of @var{name},
9181 it is removed from @var{name} as well. @command{basename} prints the
9182 result on standard output.
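For instance, with an arbitrary file name chosen for the illustration:

```shell
# Strip the leading directories only.
plain=$(basename /usr/include/stdio.h)

# Strip the leading directories and the trailing ".h" suffix.
stem=$(basename /usr/include/stdio.h .h)
```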
9184 The only options are @option{--help} and @option{--version}. @xref{Common options}.
9188 @node dirname invocation
9189 @section @command{dirname}: Strip non-directory suffix from a file name
9192 @cindex directory components, printing
9193 @cindex stripping non-directory suffix
9194 @cindex non-directory suffix, stripping
9196 @command{dirname} prints all but the final slash-delimited component of
9197 a string (presumably a file name). Synopsis:
9203 If @var{name} is a single component, @command{dirname} prints @samp{.}
9204 (meaning the current directory).
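For instance, with arbitrary names chosen for the illustration:

```shell
# Everything up to, but not including, the final component.
parent=$(dirname /usr/bin/sort)

# A single component has no directory part, so the result is a dot.
here=$(dirname stdio.h)
```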
9206 The only options are @option{--help} and @option{--version}. @xref{Common options}.
9210 @node pathchk invocation
9211 @section @command{pathchk}: Check file name portability
9214 @cindex file names, checking validity and portability
9215 @cindex valid file names, checking for
9216 @cindex portable file names, checking for
9218 @command{pathchk} checks portability of file names. Synopsis:
9221 pathchk [@var{option}]@dots{} @var{name}@dots{}
9224 For each @var{name}, @command{pathchk} prints a message if any of
9225 these conditions is true:
9228 one of the existing directories in @var{name} does not have search
9229 (execute) permission,
9231 the length of @var{name} is larger than its filesystem's maximum
9234 the length of one component of @var{name}, corresponding to an
9235 existing directory name, is larger than its filesystem's maximum
9236 length for a file name component.
9239 The program accepts the following option. Also see @ref{Common options}.
9244 @itemx --portability
9246 @opindex --portability
9247 Instead of performing length checks on the underlying filesystem,
9248 test the length of each file name and its components against the
9249 @acronym{POSIX} minimum limits for portability. Also check that the file
9250 name contains no characters not in the portable file name character set.
9254 @cindex exit status of @command{pathchk}
9258 0 if all specified file names passed all of the tests,
9263 @node Working context
9264 @chapter Working context
9266 @cindex working context
9267 @cindex commands for printing the working context
9269 This chapter describes commands that display or alter the context in
9270 which you are working: the current directory, the terminal settings, and
9271 so forth. See also the user-related commands in the next chapter.
9274 * pwd invocation:: Print working directory.
9275 * stty invocation:: Print or change terminal characteristics.
9276 * printenv invocation:: Print environment variables.
9277 * tty invocation:: Print file name of terminal on standard input.
9281 @node pwd invocation
9282 @section @command{pwd}: Print working directory
9285 @cindex print name of current directory
9286 @cindex current working directory, printing
9287 @cindex working directory, printing
9289 @cindex symbolic links and @command{pwd}
9290 @command{pwd} prints the fully resolved name of the current directory.
9291 That is, all components of the printed name will be actual directory
9292 names---none will be symbolic links.
9294 @cindex conflicts with shell built-ins
9295 @cindex built-in shell commands, conflicts with
9296 Because most shells have a built-in command by the same name, using the
9297 unadorned command name in a script or interactively may get you
9298 different functionality than that described here.
9300 The only options are a lone @option{--help} or
9301 @option{--version}. @xref{Common options}.
9304 @node stty invocation
9305 @section @command{stty}: Print or change terminal characteristics
9308 @cindex change or print terminal settings
9309 @cindex terminal settings
9310 @cindex line settings of terminal
9312 @command{stty} prints or changes terminal characteristics, such as baud rate.
9316 stty [@var{option}] [@var{setting}]@dots{}
9320 If given no line settings, @command{stty} prints the baud rate, line
9321 discipline number (on systems that support it), and line settings
9322 that have been changed from the values set by @samp{stty sane}.
9323 By default, mode reading and setting are performed on the tty line
9324 connected to standard input, although this can be modified by the
9325 @option{--file} option.
9327 @command{stty} accepts many non-option arguments that change aspects of
9328 the terminal line operation, as described below.
9330 The program accepts the following options. Also see @ref{Common options}.
9337 Print all current settings in human-readable form. This option may not
9338 be used in combination with any line settings.
9340 @item -F @var{device}
9341 @itemx --file=@var{device}
9344 Set the line opened by the filename specified in @var{device} instead of
9345 the tty line connected to standard input. This option is necessary
9346 because opening a @acronym{POSIX} tty requires use of the @code{O_NONDELAY} flag to
9347 prevent a @acronym{POSIX} tty from blocking until the carrier detect line is high if
9348 the @code{clocal} flag is not set. Hence, it is not always possible
9349 to allow the shell to open the device in the traditional manner.
9355 @cindex machine-readable @command{stty} output
9356 Print all current settings in a form that can be used as an argument to
9357 another @command{stty} command to restore the current settings. This option
9358 may not be used in combination with any line settings.
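One common use of the machine-readable form is to save the terminal settings, change them temporarily, and then restore them. The sketch below assumes standard input may or may not be a terminal and does nothing when it is not; the variable names are illustrative, and turning off @code{echo} is just one example of a temporary change.

```shell
if test -t 0; then
  # Remember the current settings in a form stty can read back.
  saved=$(stty -g)

  # Change something temporarily, here by turning off echoing.
  stty -echo

  # ... read input that should not be displayed ...

  # Put every setting back the way it was.
  stty "$saved"
  restored=yes
else
  # Standard input is not a terminal, so there is nothing to save.
  restored=skipped
fi
```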
9362 Many settings can be turned off by preceding them with a @samp{-}.
9363 Such arguments are marked below with ``May be negated'' in their
9364 description. The descriptions themselves refer to the positive
9365 case, that is, when @emph{not} negated (unless stated otherwise,
9368 Some settings are not available on all @acronym{POSIX} systems, since they use
9369 extensions. Such arguments are marked below with ``Non-@acronym{POSIX}'' in their
9370 description. On non-@acronym{POSIX} systems, those or other settings also may not
9371 be available, but it's not feasible to document all the variations: just
9375 * Control:: Control settings
9376 * Input:: Input settings
9377 * Output:: Output settings
9378 * Local:: Local settings
9379 * Combination:: Combination settings
9380 * Characters:: Special characters
9381 * Special:: Special settings
9386 @subsection Control settings
9388 @cindex control settings
9394 @cindex two-way parity
9395 Generate parity bit in output and expect parity bit in input.
9402 Set odd parity (even parity if negated). May be negated.
9409 @cindex character size
9410 @cindex eight-bit characters
9411 Set character size to 5, 6, 7, or 8 bits.
9416 Send a hangup signal when the last process closes the tty. May be negated.
9422 Use two stop bits per character (one if negated). May be negated.
9426 Allow input to be received. May be negated.
9430 @cindex modem control
9431 Disable modem control signals. May be negated.
9435 @cindex hardware flow control
9436 @cindex flow control, hardware
9437 @cindex RTS/CTS flow control
9438 Enable RTS/CTS flow control. Non-@acronym{POSIX}. May be negated.
9443 @subsection Input settings
9445 @cindex input settings
9450 @cindex breaks, ignoring
9451 Ignore break characters. May be negated.
9455 @cindex breaks, cause interrupts
9456 Make breaks cause an interrupt signal. May be negated.
9460 @cindex parity, ignoring
9461 Ignore characters with parity errors. May be negated.
9465 @cindex parity errors, marking
9466 Mark parity errors (with a 255-0-character sequence). May be negated.
9470 Enable input parity checking. May be negated.
9474 @cindex eight-bit input
9475 Clear high (8th) bit of input characters. May be negated.
9479 @cindex newline, translating to return
9480 Translate newline to carriage return. May be negated.
9484 @cindex return, ignoring
9485 Ignore carriage return. May be negated.
9489 @cindex return, translating to newline
9490 Translate carriage return to newline. May be negated.
9494 @kindex C-s/C-q flow control
9495 @cindex XON/XOFF flow control
9496 Enable XON/XOFF flow control (that is, @kbd{CTRL-S}/@kbd{CTRL-Q}). May be negated.
9503 @cindex software flow control
9504 @cindex flow control, software
9505 Enable sending of @code{stop} character when the system input buffer
9506 is almost full, and @code{start} character when it becomes almost
9507 empty again. May be negated.
9511 @cindex uppercase, translating to lowercase
9512 Translate uppercase characters to lowercase. Non-@acronym{POSIX}. May be negated.
9517 Allow any character to restart output (only the start character
9518 if negated). Non-@acronym{POSIX}. May be negated.
9522 @cindex beeping at input buffer full
9523 Enable beeping and not flushing input buffer if a character arrives
9524 when the input buffer is full. Non-@acronym{POSIX}. May be negated.
9529 @subsection Output settings
9531 @cindex output settings
9532 These arguments specify output-related operations.
9537 Postprocess output. May be negated.
9541 @cindex lowercase, translating to output
9542 Translate lowercase characters to uppercase. Non-@acronym{POSIX}. May be negated.
9547 @cindex return, translating to newline
9548 Translate carriage return to newline. Non-@acronym{POSIX}. May be negated.
9552 @cindex newline, translating to crlf
9553 Translate newline to carriage return-newline. Non-@acronym{POSIX}. May be negated.
9558 Do not print carriage returns in the first column. Non-@acronym{POSIX}.
9563 Newline performs a carriage return. Non-@acronym{POSIX}. May be negated.
9567 @cindex pad instead of timing for delaying
9568 Use fill (padding) characters instead of timing for delays. Non-@acronym{POSIX}.
9573 @cindex pad character
9574 Use delete characters for fill instead of null characters. Non-@acronym{POSIX}.
9580 Newline delay style. Non-@acronym{POSIX}.
9587 Carriage return delay style. Non-@acronym{POSIX}.
9594 Horizontal tab delay style. Non-@acronym{POSIX}.
9599 Backspace delay style. Non-@acronym{POSIX}.
9604 Vertical tab delay style. Non-@acronym{POSIX}.
9609 Form feed delay style. Non-@acronym{POSIX}.
9614 @subsection Local settings
9616 @cindex local settings
9621 Enable @code{interrupt}, @code{quit}, and @code{suspend} special
9622 characters. May be negated.
9626 Enable @code{erase}, @code{kill}, @code{werase}, and @code{rprnt}
9627 special characters. May be negated.
9631 Enable non-@acronym{POSIX} special characters. May be negated.
9635 Echo input characters. May be negated.
9641 Echo @code{erase} characters as backspace-space-backspace. May be negated.
9646 @cindex newline echoing after @code{kill}
9647 Echo a newline after a @code{kill} character. May be negated.
9651 @cindex newline, echoing
9652 Echo newline even if not echoing other characters. May be negated.
9656 @cindex flushing, disabling
9657 Disable flushing after @code{interrupt} and @code{quit} special
9658 characters. May be negated.
9662 @cindex case translation
9663 Enable input and output of uppercase characters by preceding their
9664 lowercase equivalents with @samp{\}, when @code{icanon} is set.
9665 Non-@acronym{POSIX}. May be negated.
9669 @cindex background jobs, stopping at terminal write
9670 Stop background jobs that try to write to the terminal. Non-@acronym{POSIX}.
9677 Echo erased characters backward, between @samp{\} and @samp{/}.
9678 Non-@acronym{POSIX}. May be negated.
9684 @cindex control characters, using @samp{^@var{c}}
9685 @cindex hat notation for control characters
9686 Echo control characters in hat notation (@samp{^@var{c}}) instead
9687 of literally. Non-@acronym{POSIX}. May be negated.
9693 Echo the @code{kill} special character by erasing each character on
9694 the line as indicated by the @code{echoprt} and @code{echoe} settings,
9695 instead of by the @code{echoctl} and @code{echok} settings. Non-@acronym{POSIX}.
9701 @subsection Combination settings
9703 @cindex combination settings
9704 Combination settings:
9711 Same as @code{parenb -parodd cs7}. May be negated. If negated, same
9712 as @code{-parenb cs8}.
9716 Same as @code{parenb parodd cs7}. May be negated. If negated, same
9717 as @code{-parenb cs8}.
9721 Same as @code{-icrnl -onlcr}. May be negated. If negated, same as
9722 @code{icrnl -inlcr -igncr onlcr -ocrnl -onlret}.
9726 Reset the @code{erase} and @code{kill} special characters to their default values.
9732 @c This is too long to write inline.
9734 cread -ignbrk brkint -inlcr -igncr icrnl -ixoff
9735 -iuclc -ixany imaxbel opost -olcuc -ocrnl onlcr
9736 -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0
9737 ff0 isig icanon iexten echo echoe echok -echonl
9738 -noflsh -xcase -tostop -echoprt echoctl echoke
9740 @noindent and also sets all special characters to their default values.
9744 Same as @code{brkint ignpar istrip icrnl ixon opost isig icanon}, plus
9745 sets the @code{eof} and @code{eol} characters to their default values
9746 if they are the same as the @code{min} and @code{time} characters.
9747 May be negated. If negated, same as @code{raw}.
9753 -ignbrk -brkint -ignpar -parmrk -inpck -istrip
9754 -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany
9755 -imaxbel -opost -isig -icanon -xcase min 1 time 0
9757 @noindent May be negated. If negated, same as @code{cooked}.
9761 Same as @code{-icanon}. May be negated. If negated, same as @code{icanon}.
9766 @cindex eight-bit characters
9767 Same as @code{-parenb -istrip cs8}. May be negated. If negated,
9768 same as @code{parenb istrip cs7}.
9772 Same as @code{-parenb -istrip -opost cs8}. May be negated.
9773 If negated, same as @code{parenb istrip opost cs7}.
9777 Same as @code{-ixany}. Non-@acronym{POSIX}. May be negated.
9781 Same as @code{tab0}. Non-@acronym{POSIX}. May be negated. If negated, same as @code{tab3}.
9788 Same as @code{xcase iuclc olcuc}. Non-@acronym{POSIX}. May be negated.
9792 Same as @code{echoe echoctl echoke}.
9796 Same as @code{echoe echoctl echoke -ixany intr ^C erase ^? kill ^U}.
9801 @subsection Special characters
9803 @cindex special characters
9804 @cindex characters, special
9806 The special characters' default values vary from system to system.
9807 They are set with the syntax @samp{name value}, where the names are
9808 listed below and the value can be given either literally, in hat
9809 notation (@samp{^@var{c}}), or as an integer which may start with
9810 @samp{0x} to indicate hexadecimal, @samp{0} to indicate octal, or
9811 any other digit to indicate decimal.
9813 @cindex disabling special characters
9814 @kindex u@r{, and disabling special characters}
9815 For GNU stty, giving a value of @code{^-} or @code{undef} disables that
9816 special character. (This is incompatible with Ultrix @command{stty},
9817 which uses a value of @samp{u} to disable a special character. GNU
9818 @command{stty} treats a value @samp{u} like any other, namely to set that
9819 special character to @key{U}.)
9825 Send an interrupt signal.
9833 Erase the last character typed.
9837 Erase the current line.
9841 Send an end of file (terminate the input).
9849 Alternate character to end the line. Non-@acronym{POSIX}.
9853 Switch to a different shell layer. Non-@acronym{POSIX}.
9857 Restart the output after stopping it.
9865 Send a terminal stop signal.
9869 Send a terminal stop signal after flushing the input. Non-@acronym{POSIX}.
9873 Redraw the current line. Non-@acronym{POSIX}.
9877 Erase the last word typed. Non-@acronym{POSIX}.
9881 Enter the next character typed literally, even if it is a special
9882 character. Non-@acronym{POSIX}.
9887 @subsection Special settings
9889 @cindex special settings
9894 Set the minimum number of characters that will satisfy a read until
9895 the time value has expired, when @code{-icanon} is set.
9899 Set the number of tenths of a second before reads time out if the minimum
9900 number of characters have not been read, when @code{-icanon} is set.
9902 @item ispeed @var{n}
9904 Set the input speed to @var{n}.
9906 @item ospeed @var{n}
9908 Set the output speed to @var{n}.
9912 Tell the tty kernel driver that the terminal has @var{n} rows. Non-@acronym{POSIX}.
9915 @itemx columns @var{n}
9918 Tell the kernel that the terminal has @var{n} columns. Non-@acronym{POSIX}.
9924 Print the number of rows and columns that the kernel thinks the
9925 terminal has. (Systems that don't support rows and columns in the kernel
9926 typically use the environment variables @env{LINES} and @env{COLUMNS}
9927 instead; however, GNU @command{stty} does not know anything about them.)
9928 Non-@acronym{POSIX}.
9932 Use line discipline @var{n}. Non-@acronym{POSIX}.
9936 Print the terminal speed.
9939 @cindex baud rate, setting
9940 @c FIXME: Is this still true that the baud rate can't be set
9941 @c higher than 38400?
9942 Set the input and output speeds to @var{n}. @var{n} can be one
9943 of: 0 50 75 110 134 134.5 150 200 300 600 1200 1800 2400 4800 9600
9944 19200 38400 @code{exta} @code{extb}. @code{exta} is the same as
9945 19200; @code{extb} is the same as 38400. 0 hangs up the line if
9946 @code{-clocal} is set.
9950 @node printenv invocation
9951 @section @command{printenv}: Print all or some environment variables
9954 @cindex printing all or some environment variables
9955 @cindex environment variables, printing
9957 @command{printenv} prints environment variable values. Synopsis:
9960 printenv [@var{option}] [@var{variable}]@dots{}
9963 If no @var{variable}s are specified, @command{printenv} prints the value of
9964 every environment variable. Otherwise, it prints the value of each
9965 @var{variable} that is set, and nothing for those that are not set.
9967 The only options are a lone @option{--help} or @option{--version}.
9968 @xref{Common options}.
9970 @cindex exit status of @command{printenv}
9974 0 if all variables specified were found
9975 1 if at least one specified variable was not found
9976 2 if a write error occurred
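The behavior for set and unset variables can be sketched as follows. @code{GR_DEMO} and @code{GR_UNSET} are hypothetical variable names used only for this illustration.

```shell
# GR_DEMO and GR_UNSET are made-up names for this sketch.
GR_DEMO=hello
export GR_DEMO
unset GR_UNSET

# A variable that is set has its value printed.
value=$(printenv GR_DEMO)

# Nothing is printed for an unset variable, and the exit status is 1.
printenv GR_UNSET > /dev/null || missing_status=$?
```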
9980 @node tty invocation
9981 @section @command{tty}: Print file name of terminal on standard input
9984 @cindex print terminal file name
9985 @cindex terminal file name, printing
9987 @command{tty} prints the file name of the terminal connected to its standard
9988 input. It prints @samp{not a tty} if standard input is not a terminal.
9992 tty [@var{option}]@dots{}
9995 The program accepts the following option. Also see @ref{Common options}.
10005 Print nothing; only return an exit status.
10009 @cindex exit status of @command{tty}
10013 0 if standard input is a terminal
10014 1 if standard input is not a terminal
10015 2 if given incorrect arguments
10016 3 if a write error occurs
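A script can use the exit status to find out whether it is running interactively. The sketch below redirects standard input from @file{/dev/null} so that the outcome does not depend on where it is run; it assumes the short form @option{-s} of the option that prints nothing, and the variable names are illustrative.

```shell
# With input redirected from /dev/null there is no terminal, so tty
# reports that fact and exits with status 1.
name=$(tty < /dev/null) || true

# The silent form prints nothing; only the status is of interest.
tty -s < /dev/null || status=$?
```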
10020 @node User information
10021 @chapter User information
10023 @cindex user information, commands for
10024 @cindex commands for printing user information
10026 This chapter describes commands that print user-related information:
10027 logins, groups, and so forth.
10030 * id invocation:: Print real and effective uid and gid.
10031 * logname invocation:: Print current login name.
10032 * whoami invocation:: Print effective user id.
10033 * groups invocation:: Print group names a user is in.
10034 * users invocation:: Print login names of users currently logged in.
10035 * who invocation:: Print who is currently logged in.
10039 @node id invocation
10040 @section @command{id}: Print real and effective uid and gid
10043 @cindex real uid and gid, printing
10044 @cindex effective uid and gid, printing
10045 @cindex printing real and effective uid and gid
10047 @command{id} prints information about the given user, or the process
10048 running it if no user is specified. Synopsis:
10051 id [@var{option}]@dots{} [@var{username}]
10054 By default, it prints the real user id, real group id, effective user id
10055 if different from the real user id, effective group id if different from
10056 the real group id, and supplementary group ids.
10058 Each of these numeric values is preceded by an identifying string and
10059 followed by the corresponding user or group name in parentheses.
10061 The options cause @command{id} to print only part of the above information.
10062 Also see @ref{Common options}.
10069 Print only the group id.
10075 Print only the supplementary groups.
10081 Print the user or group name instead of the ID number. Requires
10082 @code{-u}, @code{-g}, or @code{-G}.
10088 Print the real, instead of effective, user or group id. Requires
10089 @code{-u}, @code{-g}, or @code{-G}.
10095 Print only the user id.
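The options are convenient in scripts that need a single value. This sketch sticks to the numeric forms, which do not depend on the user database; the variable names are illustrative.

```shell
uid=$(id -u)     # effective user id, as a number
ruid=$(id -ru)   # real user id, as a number
gid=$(id -g)     # effective group id, as a number
gids=$(id -G)    # all group ids, separated by blanks
```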
10100 @node logname invocation
10101 @section @command{logname}: Print current login name
10104 @cindex printing user's login name
10105 @cindex login name, printing
10106 @cindex user name, printing
10111 @command{logname} prints the calling user's name, as found in the file
10112 @file{/etc/utmp}, and exits with a status of 0. If there is no
10113 @file{/etc/utmp} entry for the calling process, @command{logname} prints
10114 an error message and exits with a status of 1.
10116 The only options are @option{--help} and @option{--version}. @xref{Common options}.
10120 @node whoami invocation
10121 @section @command{whoami}: Print effective user id
10124 @cindex effective UID, printing
10125 @cindex printing the effective UID
10127 @command{whoami} prints the user name associated with the current
10128 effective user id. It is equivalent to the command @samp{id -un}.
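The equivalence can be checked directly. This is only a sketch; the
variable names are illustrative.

```shell
# Both commands report the name associated with the effective user id.
who_name=$(whoami)
id_name=$(id -un)
if [ "$who_name" = "$id_name" ]; then
  echo "whoami and id -un agree: $who_name"
fi
```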
10130 The only options are @option{--help} and @option{--version}. @xref{Common
10134 @node groups invocation
10135 @section @command{groups}: Print group names a user is in
10138 @cindex printing groups a user is in
10139 @cindex supplementary groups, printing
10141 @command{groups} prints the names of the primary and any supplementary
10142 groups for each given @var{username}, or the current process if no names
10143 are given. If names are given, the name of each user is printed before
10144 the list of that user's groups. Synopsis:
10147 groups [@var{username}]@dots{}
10150 The group lists are equivalent to the output of the command @samp{id -Gn}.
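When a name is given, the list is preceded by that name. The sketch
below strips that prefix and compares the remainder with @samp{id -Gn}
for the same user. It assumes the prefix has the form
@samp{@var{username} : }, as printed by current GNU versions; the
variable names are illustrative.

```shell
# Look up the current user by name so both commands consult the same database.
user=$(id -un)
# Drop the leading "username : " prefix from the groups output.
from_groups=$(groups "$user" | sed 's/^[^:]*: //')
from_id=$(id -Gn "$user")
echo "groups: $from_groups"
echo "id -Gn: $from_id"
```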
10152 The only options are @option{--help} and @option{--version}. @xref{Common
10156 @node users invocation
10157 @section @command{users}: Print login names of users currently logged in
10160 @cindex printing current usernames
10161 @cindex usernames, printing current
10163 @cindex login sessions, printing users with
10164 @command{users} prints on a single line a blank-separated list of user
10165 names of users currently logged in to the current host. Each user name
10166 corresponds to a login session, so if a user has more than one login
10167 session, that user's name will appear the same number of times in the
10176 With no @var{file} argument, @command{users} extracts its information from
10177 the file @file{/etc/utmp}. If a file argument is given, @command{users}
10178 uses that file instead. A common choice is @file{/etc/wtmp}.
10180 The only options are @option{--help} and @option{--version}. @xref{Common
10184 @node who invocation
10185 @section @command{who}: Print who is currently logged in
10188 @cindex printing current user information
10189 @cindex information, about current users
10191 @command{who} prints information about users who are currently logged on.
10195 @command{who} [@var{option}] [@var{file}] [am i]
10198 @cindex terminal lines, currently used
10200 @cindex remote hostname
10201 If given no non-option arguments, @command{who} prints the following
10202 information for each user currently logged on: login name, terminal
10203 line, login time, and remote hostname or X display.
10207 If given one non-option argument, @command{who} uses that instead of
10208 @file{/etc/utmp} as the name of the file containing the record of
10209 users logged on. @file{/etc/wtmp} is commonly given as an argument
10210 to @command{who} to look at who has previously logged on.
10214 If given two non-option arguments, @command{who} prints only the entry
10215 for the user running it (determined from its standard input), preceded
10216 by the hostname. Traditionally, the two arguments given are @samp{am
10217 i}, as in @samp{who am i}.
10219 The program accepts the following options. Also see @ref{Common options}.
10224 Same as @samp{who am i}.
10230 Print only the login names and the number of users logged on.
10231 Overrides all other options.
10235 Ignored; for compatibility with other versions of @command{who}.
10244 After the login time, print the number of hours and minutes that the
10245 user has been idle. @samp{.} means the user was active in the last minute.
10246 @samp{old} means the user was idle for more than 24 hours.
10252 Attempt to canonicalize hostnames found in utmp through a DNS lookup. This
10253 is not the default because it can cause significant delays on systems with
10254 automatic dial-up internet access.
10260 Print a line of column headings.
10271 @opindex --writable
10272 @cindex message status
10273 @pindex write@r{, allowed}
10274 After each login name print a character indicating the user's message status:
10277 @samp{+} allowing @code{write} messages
10278 @samp{-} disallowing @code{write} messages
10279 @samp{?} cannot find terminal device
10285 @node System context
10286 @chapter System context
10288 @cindex system context
10289 @cindex context, system
10290 @cindex commands for system context
10292 This chapter describes commands that print or change system-wide
10296 * date invocation:: Print or set system date and time.
10297 * uname invocation:: Print system information.
10298 * hostname invocation:: Print or set system name.
10299 * hostid invocation:: Print numeric host identifier.
10303 @node date invocation
10304 @section @command{date}: Print or set system date and time
10307 @cindex time, printing or setting
10308 @cindex printing the current time
10313 date [@var{option}]@dots{} [+@var{format}]
10314 date [-u|--utc|--universal] @c this avoids a newline in the output
10315 [ MMDDhhmm[[CC]YY][.ss] ]
10318 Invoking @command{date} with no @var{format} argument is equivalent to invoking
10319 @samp{date '+%a %b %e %H:%M:%S %Z %Y'}.
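One way to check this equivalence is to format the same instant both
ways. The sketch assumes the C locale (other locales may define a
different default format) and pins the time zone with @env{TZ} so that
the two runs are comparable; the variable names are illustrative.

```shell
# A fixed instant, so both invocations format the same time.
when='2000-01-01 00:00:00 UTC'
default_out=$(LC_ALL=C TZ=UTC0 date -d "$when")
explicit_out=$(LC_ALL=C TZ=UTC0 date -d "$when" '+%a %b %e %H:%M:%S %Z %Y')
echo "$default_out"
echo "$explicit_out"
```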
10321 @findex strftime @r{and @command{date}}
10322 @cindex time formats
10323 @cindex formatting times
10324 If given an argument that starts with a @samp{+}, @command{date} prints the
10325 current time and date (or the time and date specified by the
10326 @code{--date} option, see below) in the format defined by that argument,
10327 which is the same as in the @code{strftime} function. Except for
10328 directives, which start with @samp{%}, characters in the format string
10329 are printed unchanged. The directives are described below.
10332 * Time directives:: %[HIklMprsSTXzZ]
10333 * Date directives:: %[aAbBcdDhjmUwWxyY]
10334 * Literal directives:: %[%nt]
10335 * Padding:: Pad with zeroes, spaces (%_), or nothing (%-).
10336 * Setting the time:: Changing the system clock.
10337 * Options for date:: Instead of the current time.
10338 * Examples of date:: Examples.
10341 @node Time directives
10342 @subsection Time directives
10344 @cindex time directives
10345 @cindex directives, time
10347 @command{date} directives related to times.
10359 minute (00@dots{}59)
10361 nanoseconds (000000000@dots{}999999999)
10363 locale's upper case @samp{AM} or @samp{PM} (blank in many locales)
10365 locale's lower case @samp{am} or @samp{pm} (blank in many locales)
10367 time, 12-hour (hh:mm:ss [AP]M)
10369 time, 24-hour (hh:mm). Same as @code{%H:%M}.
10371 @cindex epoch, seconds since
10372 @cindex seconds since the epoch
10373 @cindex beginning of time
10374 seconds since the epoch, i.e., 1 January 1970 00:00:00 UTC (a
10376 Note that this value is the number of seconds between the epoch
10377 and the current date as defined by the @code{localtime} function.
10378 It isn't changed by the @option{--date} option.
10380 second (00@dots{}60). The range is [00@dots{}60], and not [00@dots{}59],
10381 in order to accommodate the occasional positive leap second.
10383 time, 24-hour (hh:mm:ss)
10385 locale's time representation (%H:%M:%S)
10387 RFC-822 style numeric time zone (e.g., -0600 or +0100), or nothing if no
10388 time zone is determinable. This value reflects the @emph{current} time
10389 zone. It isn't changed by the @option{--date} option.
10391 time zone (e.g., EDT), or nothing if no time zone is
10393 Note that this value reflects the @emph{current} time zone.
10394 It isn't changed by the @option{--date} option.
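As a small illustration of the time directives, the commands below
format one fixed instant in 24-hour and 12-hour form. The sketch
assumes the C locale, where @samp{%p} yields @samp{AM} or @samp{PM},
and uses @samp{%T} and @samp{%r} as the combined 24-hour and 12-hour
directives; the variable names are illustrative.

```shell
when='2000-01-01 13:05:09 UTC'
# 24-hour time from separate fields, then from the single %T directive.
fields24=$(LC_ALL=C TZ=UTC0 date -d "$when" '+%H:%M:%S')
combined24=$(LC_ALL=C TZ=UTC0 date -d "$when" '+%T')
# 12-hour time from separate fields, then from the single %r directive.
fields12=$(LC_ALL=C TZ=UTC0 date -d "$when" '+%I:%M:%S %p')
combined12=$(LC_ALL=C TZ=UTC0 date -d "$when" '+%r')
printf '%s\n' "$fields24" "$combined24" "$fields12" "$combined12"
```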
10398 @node Date directives
10399 @subsection Date directives
10401 @cindex date directives
10402 @cindex directives, date
10404 @command{date} directives related to dates.
10408 locale's abbreviated weekday name (Sun@dots{}Sat)
10410 locale's full weekday name, variable length (Sunday@dots{}Saturday)
10412 locale's abbreviated month name (Jan@dots{}Dec)
10414 locale's full month name, variable length (January@dots{}December)
10416 locale's date and time (Sat Nov 04 12:02:33 EST 1989)
10418 century (year divided by 100 and truncated to an integer) (00@dots{}99)
10420 day of month (01@dots{}31)
10424 blank-padded day of month (1@dots{}31)
10426 the @w{ISO 8601} standard date format: @code{%Y-%m-%d}.
10427 This is the preferred form for all uses.
10429 The year corresponding to the ISO week number, but without the century
10430 (range @code{00} through @code{99}). This has the same format and value
10431 as @code{%y}, except that if the ISO week number (see @code{%V}) belongs
10432 to the previous or next year, that year is used instead.
10434 The year corresponding to the ISO week number. This has the same format
10435 and value as @code{%Y}, except that if the ISO week number (see
10436 @code{%V}) belongs to the previous or next year, that year is used
10441 day of year (001@dots{}366)
10443 month (01@dots{}12)
10445 day of week (1@dots{}7) with 1 corresponding to Monday
10447 week number of year with Sunday as first day of week (00@dots{}53).
10448 Days in a new year preceding the first Sunday are in week zero.
10450 week number of year with Monday as first day of the week as a decimal
10451 (01@dots{}53). If the week containing January 1 has four or more days in
10452 the new year, then it is considered week 1; otherwise, it is week 53 of
10453 the previous year, and the next week is week 1. (See the @acronym{ISO} 8601
10456 day of week (0@dots{}6) with 0 corresponding to Sunday
10458 week number of year with Monday as first day of week (00@dots{}53).
10459 Days in a new year preceding the first Monday are in week zero.
10461 locale's date representation (mm/dd/yy)
10463 last two digits of year (00@dots{}99)
10465 year (1970@dots{}.)
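The week-numbering directives differ most visibly near the start of a
year. As a sketch, 1 January 2005 fell on a Saturday, so it belongs to
the last ISO week of 2004 and precedes both the first Sunday and the
first Monday of 2005. The variable name is illustrative.

```shell
# Fields: weekday, calendar year, ISO year, ISO week, day of week (Monday = 1),
# Sunday-based week, Monday-based week, day of year.
weeks=$(LC_ALL=C TZ=UTC0 date -d '2005-01-01' '+%a %Y %G %V %u %U %W %j')
echo "$weeks"   # prints: Sat 2005 2004 53 6 00 00 001
```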
10469 @node Literal directives
10470 @subsection Literal directives
10472 @cindex literal directives
10473 @cindex directives, literal
10475 @command{date} directives that produce literal strings.
10488 @subsection Padding
10490 @cindex numeric field padding
10491 @cindex padding of numeric fields
10492 @cindex fields, padding numeric
10494 By default, @command{date} pads numeric fields with zeroes, so that, for
10495 example, numeric months are always output as two digits. GNU @command{date}
10496 recognizes the following numeric modifiers between the @samp{%} and the
10501 (hyphen) do not pad the field; useful if the output is intended for
10504 (underscore) pad the field with spaces; useful if you need a fixed
10505 number of characters in the output, but zeroes are too distracting.
10509 These are GNU extensions.
10511 Here is an example illustrating the differences:
10514 date +%d/%m -d "Feb 1"
10516 date +%-d/%-m -d "Feb 1"
10518 date +%_d/%_m -d "Feb 1"
10523 @node Setting the time
10524 @subsection Setting the time
10526 @cindex setting the time
10527 @cindex time setting
10528 @cindex appropriate privileges
10530 If given an argument that does not start with @samp{+}, @command{date} sets
10531 the system clock to the time and date specified by that argument (as
10532 described below). You must have appropriate privileges to set the
10533 system clock. The @option{--date} and @option{--set} options may not be
10534 used with such an argument. The @option{--universal} option may be used
10535 with such an argument to indicate that the specified time and date are
10536 relative to Coordinated Universal Time rather than to the local time
10539 The argument must consist entirely of digits, which have the following
10552 first two digits of year (optional)
10554 last two digits of year (optional)
10559 The @option{--set} option also sets the system clock; see the next section.
10562 @node Options for date
10563 @subsection Options for @command{date}
10565 @cindex @command{date} options
10566 @cindex options for @command{date}
10568 The program accepts the following options. Also see @ref{Common options}.
10572 @item -d @var{datestr}
10573 @itemx --date=@var{datestr}
10576 @cindex parsing date strings
10577 @cindex date strings, parsing
10578 @cindex arbitrary date strings, parsing
10581 @opindex next @var{day}
10582 @opindex last @var{day}
10583 Display the time and date specified in @var{datestr} instead of the
10584 current time and date. @var{datestr} can be in almost any common
10585 format. It can contain month names, time zones, @samp{am} and @samp{pm},
10586 @samp{yesterday}, @samp{ago}, @samp{next}, etc. @xref{Date input formats}.
10588 @item -f @var{datefile}
10589 @itemx --file=@var{datefile}
10592 Parse each line in @var{datefile} as with @option{-d} and display the
10593 resulting time and date. If @var{datefile} is @samp{-}, use standard
10594 input. This is useful when you have many dates to process, because the
10595 system overhead of starting up the @command{date} executable many times can
10598 @item -I @var{timespec}
10599 @itemx --iso-8601[=@var{timespec}]
10600 @opindex -I @var{timespec}
10601 @opindex --iso-8601[=@var{timespec}]
10602 Display the date using the @acronym{ISO} 8601 format, @samp{%Y-%m-%d}.
10604 The argument @var{timespec} specifies the number of additional
10605 terms of the time to include. It can be one of the following:
10608 The default behavior: print just the date.
10611 Append the hour of the day to the date.
10614 Append the hours and minutes.
10617 Append the hours, minutes, and seconds.
10620 If showing any time terms, then include the time zone using the format
10623 If @var{timespec} is omitted with @option{--iso-8601}, the default is
10624 @samp{auto}. On older systems, @sc{gnu} @command{date} instead
10625 supports an obsolete option @option{-I[@var{timespec}]}, where
10626 @var{timespec} defaults to @samp{auto}. @acronym{POSIX} 1003.1-2001
10627 (@pxref{Standards conformance}) does not allow @option{-I} without an
10628 argument; use @option{--iso-8601} instead.
10634 Display the time and date using the RFC-822-conforming
10635 format, @samp{%a, %_d %b %Y %H:%M:%S %z}.
10637 @item -r @var{file}
10638 @itemx --reference=@var{file}
10640 @opindex --reference
10641 Display the last modification time of @var{file}, instead of the
10642 current time and date.
10644 @item -s @var{datestr}
10645 @itemx --set=@var{datestr}
10648 Set the time and date to @var{datestr}. See @option{-d} above.
10655 @opindex --universal
10656 @cindex Coordinated Universal Time
10658 @cindex Greenwich Mean Time
10660 Use Coordinated Universal Time (@acronym{UTC}) by operating as if the
10661 @env{TZ} environment variable were set to the string @samp{UTC0}.
10662 Normally, @command{date} operates in the time zone indicated by
10663 @env{TZ}, or the system default if @env{TZ} is not set. Coordinated
10664 Universal Time is often called ``Greenwich Mean Time'' (@sc{gmt}) for
10665 historical reasons.
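The following sketch exercises several of these options on fixed
inputs. The temporary file created with @command{mktemp} and stamped
with @command{touch}, and all variable names, are illustrative helpers
rather than part of @command{date}.

```shell
# -u behaves as if TZ were set to UTC0.
with_u=$(LC_ALL=C date -u -d '2000-06-15 12:00:00' '+%Y-%m-%d %H:%M:%S %Z')
with_tz=$(LC_ALL=C TZ=UTC0 date -d '2000-06-15 12:00:00' '+%Y-%m-%d %H:%M:%S %Z')

# --iso-8601 with no timespec prints just the date.
iso_day=$(date -u --iso-8601 -d '2000-06-15 12:00:00')

# -f - parses one date per line of standard input.
day_of_year=$(printf '2000-01-01\n2000-12-31\n' | date -u -f - +%j)

# -r reports the last modification time of a file.
tmpfile=$(mktemp)
touch -d '2001-02-03 04:05:06 UTC' "$tmpfile"
mtime=$(date -u -r "$tmpfile" '+%Y-%m-%d %H:%M:%S')
rm -f "$tmpfile"

printf '%s\n' "$with_u" "$with_tz" "$iso_day" "$day_of_year" "$mtime"
```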
10669 @node Examples of date
10670 @subsection Examples of @command{date}
10672 @cindex examples of @command{date}
10674 Here are a few examples. Also see the documentation for the @option{-d}
10675 option in the previous section.
10680 To print the date of the day before yesterday:
10683 date --date='2 days ago'
10687 To print the date of the day three months and one day hence:
10689 date --date='3 months 1 day'
10693 To print the day of year of Christmas in the current year:
10695 date --date='25 Dec' +%j
10699 To print the current full month name and the day of the month:
10704 But this may not be what you want because for the first nine days of
10705 the month, the @samp{%d} expands to a zero-padded two-digit field;
10706 for example, @samp{date -d 1may '+%B %d'} prints @samp{May 01}.
10709 To print a date without the leading zero for one-digit days
10710 of the month, you can use the (GNU extension) @code{-} modifier to suppress
10711 the padding altogether.
10713 date -d 1may '+%B %-d'
10717 To print the current date and time in the format required by many
10718 non-GNU versions of @command{date} when setting the system clock:
10720 date +%m%d%H%M%Y.%S
10724 To set the system clock forward by two minutes:
10726 date --set='+2 minutes'
10730 To print the date in the format specified by RFC-822,
10731 use @samp{date --rfc}. I just did and saw this:
10734 Mon, 25 Mar 1996 23:34:17 -0600
10738 To convert a date string to the number of seconds since the epoch
10739 (which is 1970-01-01 00:00:00 UTC), use the @option{--date} option with
10740 the @samp{%s} format. That can be useful in sorting and/or graphing
10741 and/or comparing data by date. The following command outputs the
10742 number of seconds since the epoch for the time two minutes after the
10746 date --date='1970-01-01 00:02:00 +0000' +%s
10750 If you do not specify time zone information in the date string,
10751 @command{date} uses your computer's idea of the time zone when
10752 interpreting the string. For example, if your computer's time zone is
10753 that of Cambridge, Massachusetts, which was then 5 hours (i.e., 18,000
10754 seconds) behind UTC:
10757 # local time zone used
10758 date --date='1970-01-01 00:02:00' +%s
10763 If you're sorting or graphing dated data, your raw date values may be
10764 represented as seconds since the epoch. But few people can look at
10765 the date @samp{946684800} and casually note ``Oh, that's the first second
10766 of the year 2000 in Greenwich, England.''
10769 date --date='2000-01-01 UTC' +%s
10773 To convert such an unwieldy number of seconds back to
10774 a more readable form, use a command like this:
10777 # local time zone used
10778 date -d '1970-01-01 UTC 946684800 seconds' +"%Y-%m-%d %T %z"
10779 1999-12-31 19:00:00 -0500
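To illustrate the sorting use mentioned above, this sketch prefixes
each date with its seconds-since-the-epoch value, sorts numerically,
and then removes the prefix again. The sample dates and the variable
name are made up for the example.

```shell
sorted=$(
  for d in '1 Mar 2000' '31 Dec 1999' '15 Jan 2000'; do
    # Emit "seconds original-date" so sort can order by the first field.
    printf '%s %s\n' "$(date -u -d "$d" +%s)" "$d"
  done | sort -n | cut -d' ' -f2-
)
echo "$sorted"
```

The dates come out in chronological order, with @samp{31 Dec 1999}
first and @samp{1 Mar 2000} last.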
10785 @node uname invocation
10786 @section @command{uname}: Print system information
10789 @cindex print system information
10790 @cindex system information, printing
10792 @command{uname} prints information about the machine and operating system
10793 it is run on. If no options are given, @command{uname} acts as if the
10794 @code{-s} option were given. Synopsis:
10797 uname [@var{option}]@dots{}
10800 If multiple options or @code{-a} are given, the selected information is
10801 printed in this order:
10804 @var{kernel-name} @var{nodename} @var{kernel-release} @var{kernel-version} @var{machine} @var{processor} @var{hardware-platform} @var{operating-system}
10807 The information may contain internal spaces, so such output cannot be
10808 parsed reliably. In the following example, @var{release} is
10809 @samp{2.2.18ss.e820-bda652a #4 SMP Tue Jun 5 11:24:08 PDT 2001}:
10813 @result{} Linux dum 2.2.18ss.e820-bda652a #4 SMP Tue Jun 5 11:24:08 PDT 2001 i686 unknown unknown GNU/Linux
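Because the combined line cannot be split reliably, a script can
instead request each field with its own option. A minimal sketch, with
illustrative variable names:

```shell
# One invocation per field avoids parsing the space-separated -a output.
kernel_name=$(uname -s)
kernel_release=$(uname -r)
machine=$(uname -m)
printf 'name=%s release=%s machine=%s\n' "$kernel_name" "$kernel_release" "$machine"
```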
10817 The program accepts the following options. Also see @ref{Common options}.
10825 Print all of the below information.
10828 @itemx --hardware-platform
10830 @opindex --hardware-platform
10831 @cindex implementation, hardware
10832 @cindex hardware platform
10833 @cindex platform, hardware
10834 Print the hardware platform name
10835 (sometimes called the hardware implementation).
10841 @cindex machine type
10842 @cindex hardware class
10843 @cindex hardware type
10844 Print the machine hardware name (sometimes called the hardware class).
10849 @opindex --nodename
10852 @cindex network node name
10853 Print the network node hostname.
10858 @opindex --processor
10859 @cindex host processor type
10860 Print the processor type (sometimes called the instruction set
10861 architecture or ISA).
10864 @itemx --operating-system
10866 @opindex --operating-system
10867 @cindex operating system name
10868 Print the name of the operating system.
10871 @itemx --kernel-release
10873 @opindex --kernel-release
10874 @cindex kernel release
10875 @cindex release of kernel
10876 Print the kernel release.
10879 @itemx --kernel-name
10881 @opindex --kernel-name
10882 @cindex kernel name
10883 @cindex name of kernel
10884 Print the kernel name.
10887 @itemx --kernel-version
10889 @opindex --kernel-version
10890 @cindex kernel version
10891 @cindex version of kernel
10892 Print the kernel version.
10896 @node hostname invocation
10897 @section @command{hostname}: Print or set system name
10900 @cindex setting the hostname
10901 @cindex printing the hostname
10902 @cindex system name, printing
10903 @cindex appropriate privileges
10905 With no arguments, @command{hostname} prints the name of the current host
10906 system. With one argument, it sets the current host name to the
10907 specified string. You must have appropriate privileges to set the host
10911 hostname [@var{name}]
10914 The only options are @option{--help} and @option{--version}. @xref{Common
10918 @node hostid invocation
10919 @section @command{hostid}: Print numeric host identifier
10922 @cindex printing the host identifier
10924 @command{hostid} prints the numeric identifier of the current host
10925 in hexadecimal. This command accepts no arguments.
10926 The only options are @option{--help} and @option{--version}.
10927 @xref{Common options}.
10929 For example, here's what it prints on one system I use:
10936 On that system, the 32-bit quantity happens to be closely
10937 related to the system's Internet address, but that isn't always
10941 @node Modified command invocation
10942 @chapter Modified command invocation
10944 @cindex modified command invocation
10945 @cindex invocation of commands, modified
10946 @cindex commands for invoking other commands
10948 This chapter describes commands that run other commands in some context
10949 different from the current one: a modified environment, as a different
10953 * chroot invocation:: Modify the root directory.
10954 * env invocation:: Modify environment variables.
10955 * nice invocation:: Modify scheduling priority.
10956 * nohup invocation:: Immunize to hangups.
10957 * su invocation:: Modify user and group id.
10961 @node chroot invocation
10962 @section @command{chroot}: Run a command with a different root directory
10965 @cindex running a program in a specified root directory
10966 @cindex root directory, running a program in a specified
10968 @command{chroot} runs a command with a specified root directory.
10969 On many systems, only the super-user can do this.
10973 chroot @var{newroot} [@var{command} [@var{args}]@dots{}]
10974 chroot @var{option}
10977 Ordinarily, filenames are looked up starting at the root of the
10978 directory structure, i.e., @file{/}. @command{chroot} changes the root to
10979 the directory @var{newroot} (which must exist) and then runs
10980 @var{command} with optional @var{args}. If @var{command} is not
10981 specified, the default is the value of the @env{SHELL} environment
10982 variable or @code{/bin/sh} if not set, invoked with the @option{-i} option.
10984 The only options are @option{--help} and @option{--version}. @xref{Common
10987 Here are a few tips to help avoid common problems in using @command{chroot}.
10988 To start with a simple example, make @var{command} refer to a statically
10989 linked binary. If you were to use a dynamically linked executable, then
10990 you'd have to arrange to have the shared libraries in the right place under
10991 your new root directory.
10993 For example, if you create a statically linked @command{ls} executable,
10994 and put it in @file{/tmp/empty}, you can run this command as root:
10997 $ chroot /tmp/empty /ls -Rl /
11000 Then you'll see output like this:
11005 -rwxr-xr-x 1 0 0 1041745 Aug 16 11:17 ls
11008 If you want to use a dynamically linked executable, say @code{bash},
11009 then first run @samp{ldd bash} to see what shared objects it needs.
11010 Then, in addition to copying the actual binary, also copy the listed
11011 files to the required positions under your intended new root directory.
11012 Finally, if the executable requires any other files (e.g., data, state,
11013 device files), copy them into place, too.
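The copying steps for a dynamically linked program might be sketched
as follows. This is an illustration under several assumptions: the
staging directory comes from @command{mktemp}, the program is whatever
@command{sh} resolves to, and every absolute path in the
@command{ldd} output is taken to be a file that must be copied. The
sketch stops short of calling @command{chroot}, since that step
normally needs super-user privileges.

```shell
# Hypothetical staging directory; nothing here needs root until chroot itself.
newroot=$(mktemp -d)
prog=$(command -v sh)

# Copy the program to the same path under the new root.
mkdir -p "$newroot$(dirname "$prog")"
cp -L "$prog" "$newroot$prog"

# Copy each shared object that ldd reports with an absolute path.
ldd "$prog" 2>/dev/null | grep -o '/[^ ]*' | while read -r lib; do
  mkdir -p "$newroot$(dirname "$lib")"
  cp -L "$lib" "$newroot$lib"
done

echo "staged under $newroot; as root, run: chroot $newroot $prog"
```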
11016 @node env invocation
11017 @section @command{env}: Run a command in a modified environment
11020 @cindex environment, running a program in a modified
11021 @cindex modified environment, running a program in a
11022 @cindex running a program in a modified environment
11024 @command{env} runs a command with a modified environment. Synopses:
11027 env [@var{option}]@dots{} [@var{name}=@var{value}]@dots{} @c
11028 [@var{command} [@var{args}]@dots{}]
11032 Arguments of the form @samp{@var{variable}=@var{value}} set
11033 the environment variable @var{variable} to value @var{value}.
11034 @var{value} may be empty (@samp{@var{variable}=}). Setting a variable
11035 to an empty value is different from unsetting it.
11038 The first remaining argument specifies the program name to invoke; it is
11039 searched for according to the @env{PATH} environment variable. Any
11040 remaining arguments are passed as arguments to that program.
11042 @cindex environment, printing
11044 If no command name is specified following the environment
11045 specifications, the resulting environment is printed. This is like
11046 specifying a command name of @command{printenv}.
11048 The program accepts the following options. Also see @ref{Common options}.
11052 @item -u @var{name}
11053 @itemx --unset=@var{name}
11056 Remove variable @var{name} from the environment, if it was in the
11061 @itemx --ignore-environment
11064 @opindex --ignore-environment
11065 Start with an empty environment, ignoring the inherited environment.
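The difference between an empty value and an unset variable, and the
effect of @option{-i}, can be seen with a short sketch.
@env{DEMO_VAR} and @env{ONLY_VAR} are made-up variable names, and
@command{printenv} is used only because it fails when the named
variable is not set.

```shell
# An empty value is still set, so printenv succeeds.
if env DEMO_VAR= printenv DEMO_VAR >/dev/null; then
  empty_is_set=yes
else
  empty_is_set=no
fi

# -u removes the variable, so printenv fails.
if DEMO_VAR=x env -u DEMO_VAR printenv DEMO_VAR >/dev/null; then
  unset_is_set=yes
else
  unset_is_set=no
fi

# -i starts from an empty environment; with no command, env prints the result.
clean_env=$(env -i ONLY_VAR=1)

echo "empty value set: $empty_is_set; after -u set: $unset_is_set"
echo "$clean_env"
```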
11070 @node nice invocation
11071 @section @command{nice}: Run a command with modified scheduling priority
11074 @cindex modifying scheduling priority
11075 @cindex scheduling priority, modifying
11076 @cindex priority, modifying
11077 @cindex appropriate privileges
11079 @command{nice} prints or modifies the scheduling priority of a job.
11083 nice [@var{option}]@dots{} [@var{command} [@var{arg}]@dots{}]
11086 If no arguments are given, @command{nice} prints the current scheduling
11087 priority, which it inherited. Otherwise, @command{nice} runs the given
11088 @var{command} with its scheduling priority adjusted. If no
11089 @var{adjustment} is given, the priority of the command is incremented by
11090 10. You must have appropriate privileges to specify a negative
11091 adjustment. The priority can be adjusted by @command{nice} over the range
11092 of -20 (the highest priority) to 19 (the lowest).
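Since @command{nice} with no arguments prints the value it inherited,
running it under another @command{nice} shows the adjustment taking
effect. The sketch uses the @option{-n} option described below; the
variable names are illustrative, and the actual numbers depend on the
priority of the invoking shell.

```shell
# Value inherited from the invoking shell.
base=$(nice)
# The inner nice prints the value after the outer one has added 5.
raised=$(nice -n 5 nice)
echo "inherited: $base; after 'nice -n 5': $raised"
```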
11094 @cindex conflicts with shell built-ins
11095 @cindex built-in shell commands, conflicts with
11096 Because most shells have a built-in command by the same name, using the
11097 unadorned command name in a script or interactively may get you
11098 different functionality than that described here.
11100 The program accepts the following option. Also see @ref{Common options}.
11103 @item -n @var{adjustment}
11104 @itemx --adjustment=@var{adjustment}
11106 @opindex --adjustment
11107 Add @var{adjustment} instead of 10 to the command's priority.
11109 On older systems, @command{nice} supports an obsolete option
11110 @option{-@var{adjustment}}. @acronym{POSIX} 1003.1-2001 (@pxref{Standards
11111 conformance}) does not allow this; use @option{-n @var{adjustment}}
11117 @node nohup invocation
11118 @section @command{nohup}: Run a command immune to hangups
11121 @cindex hangups, immunity to
11122 @cindex immunity to hangups
11123 @cindex logging out and continuing to run
11126 @command{nohup} runs the given @var{command} with hangup signals ignored,
11127 so that the command can continue running in the background after you log
11131 nohup @var{command} [@var{arg}]@dots{}
11135 If standard output is a terminal, it is redirected so that it is appended
11136 to the file @file{nohup.out}; if that cannot be written to, it is appended
11137 to the file @file{$HOME/nohup.out}. If that cannot be written to, the
11138 command is not run.
11140 If @command{nohup} creates either @file{nohup.out} or
11141 @file{$HOME/nohup.out}, it creates it with no ``group'' or ``other''
11142 access permissions. It does not change the permissions if the output
11143 file already existed.
11145 If standard error is a terminal, it is redirected to the same file
11146 descriptor as the standard output.
11148 @command{nohup} does not automatically put the command it runs in the
11149 background; you must do that explicitly, by ending the command line
11150 with an @samp{&}. Also, @command{nohup} does not change the
11151 scheduling priority of @var{command}; use @command{nice} for that,
11152 e.g., @samp{nohup nice @var{command}}.
11154 The only options are @option{--help} and @option{--version}. @xref{Common
11157 @cindex exit status of @command{nohup}
11161 126 if @var{command} was found but could not be invoked
11162 127 if @command{nohup} itself failed or if @var{command} could not be found
11163 the exit status of @var{command} otherwise
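A script can inspect these exit values directly. In this sketch,
standard output is captured rather than attached to a terminal, so
@file{nohup.out} is not involved; @file{/nonexistent/command} is a
deliberately missing path, and the variable names are illustrative.

```shell
# A command that runs normally: nohup passes its exit status through.
out=$(nohup sh -c 'echo still running' 2>/dev/null)
ok_status=$?

# A command that cannot be found: nohup reports 127.
missing_status=0
nohup /nonexistent/command >/dev/null 2>&1 || missing_status=$?

echo "output: $out; status: $ok_status; missing command status: $missing_status"
```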
11167 @node su invocation
11168 @section @command{su}: Run a command with substitute user and group id
11171 @cindex substitute user and group ids
11172 @cindex user id, switching
11173 @cindex super-user, becoming
11174 @cindex root, becoming
11176 @command{su} allows one user to temporarily become another user. It runs a
11177 command (often an interactive shell) with the real and effective user
11178 id, group id, and supplemental groups of a given @var{user}. Synopsis:
11181 su [@var{option}]@dots{} [@var{user} [@var{arg}]@dots{}]
11184 @cindex passwd entry, and @command{su} shell
11186 @flindex /etc/passwd
11187 If no @var{user} is given, the default is @code{root}, the super-user.
11188 The shell to use is taken from @var{user}'s @code{passwd} entry, or
11189 @file{/bin/sh} if none is specified there. If @var{user} has a
11190 password, @command{su} prompts for the password unless run by a user with
11191 effective user id of zero (the super-user).
11197 @cindex login shell
11198 By default, @command{su} does not change the current directory.
11199 It sets the environment variables @env{HOME} and @env{SHELL}
11200 from the password entry for @var{user}, and if @var{user} is not
11201 the super-user, sets @env{USER} and @env{LOGNAME} to @var{user}.
11202 By default, the shell is not a login shell.
11204 Any additional @var{arg}s are passed as additional arguments to the
11207 @cindex @option{-su}
11208 GNU @command{su} does not treat @file{/bin/sh} or any other shells specially
11209 (e.g., by setting @code{argv[0]} to @option{-su}, passing @code{-c} only
11210 to certain shells, etc.).
11213 @command{su} can optionally be compiled to use @code{syslog} to report
11214 failed, and optionally successful, @command{su} attempts. (If the system
11215 supports @code{syslog}.) However, GNU @command{su} does not check if the
11216 user is a member of the @code{wheel} group; see below.
11218 The program accepts the following options. Also see @ref{Common options}.
11221 @item -c @var{command}
11222 @itemx --command=@var{command}
11225 Pass @var{command}, a single command line to run, to the shell with
11226 a @code{-c} option instead of starting an interactive shell.
11233 @cindex file name pattern expansion, disabled
11234 @cindex globbing, disabled
11235 Pass the @code{-f} option to the shell. This probably only makes sense
11236 if the shell run is @code{csh} or @code{tcsh}, for which the @code{-f}
11237 option prevents reading the startup file (@file{.cshrc}). With
11238 Bourne-like shells, the @code{-f} option disables file name pattern
11239 expansion (globbing), which is not likely to be useful.
11247 @c other variables already indexed above
11250 @cindex login shell, creating
11251 Make the shell a login shell. This means the following. Unset all
11252 environment variables except @env{TERM}, @env{HOME}, and @env{SHELL}
11253 (which are set as described above), and @env{USER} and @env{LOGNAME}
11254 (which are set, even for the super-user, as described above), and set
11255 @env{PATH} to a compiled-in default value. Change to @var{user}'s home
11256 directory. Prepend @samp{-} to the shell's name, intended to make it
11257 read its login startup file(s).
11261 @itemx --preserve-environment
11264 @opindex --preserve-environment
11265 @cindex environment, preserving
11266 @flindex /etc/shells
11267 @cindex restricted shell
11268 Do not change the environment variables @env{HOME}, @env{USER},
11269 @env{LOGNAME}, or @env{SHELL}. Run the shell given in the environment
11270 variable @env{SHELL} instead of the shell from @var{user}'s passwd
11271 entry, unless the user running @command{su} is not the superuser and
11272 @var{user}'s shell is restricted. A @dfn{restricted shell} is one that
11273 is not listed in the file @file{/etc/shells}, or in a compiled-in list
11274 if that file does not exist. Parts of what this option does can be
11275 overridden by @code{--login} and @code{--shell}.
11277 @item -s @var{shell}
11278 @itemx --shell=@var{shell}
11281 Run @var{shell} instead of the shell from @var{user}'s passwd entry,
11282 unless the user running @command{su} is not the superuser and @var{user}'s
11283 shell is restricted (see @option{-m} just above).
11287 @cindex wheel group, not supported
11288 @cindex group wheel, not supported
11290 @heading Why GNU @command{su} does not support the @samp{wheel} group
11292 (This section is by Richard Stallman.)
11296 Sometimes a few of the users try to hold total power over all the
11297 rest. For example, in 1984, a few users at the MIT AI lab decided to
11298 seize power by changing the operator password on the Twenex system and
11299 keeping it secret from everyone else. (I was able to thwart this coup
11300 and give power back to the users by patching the kernel, but I
11301 wouldn't know how to do that in Unix.)
11303 However, occasionally the rulers do tell someone. Under the usual
11304 @command{su} mechanism, once someone learns the root password who
11305 sympathizes with the ordinary users, he or she can tell the rest. The
11306 ``wheel group'' feature would make this impossible, and thus cement the
11307 power of the rulers.
11309 I'm on the side of the masses, not that of the rulers. If you are
11310 used to supporting the bosses and sysadmins in whatever they do, you
11311 might find this idea strange at first.
11314 @node Process control
11315 @chapter Process control
11317 @cindex processes, commands for controlling
11318 @cindex commands for controlling processes
11321 * kill invocation:: Sending a signal to processes.
11325 @node kill invocation
11326 @section @command{kill}: Send a signal to processes
11329 @cindex send a signal to processes
11331 The @command{kill} command sends a signal to processes, causing them
11332 to terminate or otherwise act upon receiving the signal in some way.
11333 Alternatively, it lists information about signals. Synopses:
11336 kill [-s @var{signal} | --signal @var{signal} | -@var{signal}] @var{pid}@dots{}
11337 kill [-l | --list | -t | --table] [@var{signal}]@dots{}
11340 The first form of the @command{kill} command sends a signal to all
11341 @var{pid} arguments. The default signal to send if none is specified
11342 is @samp{TERM}. The special signal number @samp{0} does not denote a
11343 valid signal, but can be used to test whether the @var{pid} arguments
11344 specify processes to which a signal could be sent.
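As a minimal sketch of the signal @samp{0} test, the following shell fragment starts a throwaway @command{sleep} process, probes it, terminates it with the default @samp{TERM} signal, and probes it again. It assumes a POSIX shell, where @command{kill} is usually the shell's own built-in rather than the Coreutils program; the variable names are illustrative only.

```shell
# Start a throwaway background process to act as the target.
sleep 60 &
pid=$!

# Signal 0 delivers nothing; the exit status says whether a signal could be sent.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

# Send the default signal (TERM), then reap the process and keep its status.
kill "$pid"
status=0
wait "$pid" 2>/dev/null || status=$?

# The process is gone, so the same probe now fails.
if kill -0 "$pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi

echo "before=$alive_before after=$alive_after status=$status"
```

On systems where @samp{TERM} is signal 15, the reaped status is 143, that is, 128 plus the signal number.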
11346 If @var{pid} is positive, the signal is sent to the process with the
11347 process id @var{pid}. If @var{pid} is zero, the signal is sent to all
11348 processes in the process group of the current process. If @var{pid}
11349 is -1, the signal is sent to all processes for which the user has
11350 permission to send a signal. If @var{pid} is less than -1, the signal
11351 is sent to all processes in the process group that equals the absolute
11352 value of @var{pid}.
11354 If @var{pid} is not positive, a system-dependent set of system
11355 processes is excluded from the list of processes to which the signal is sent.
11358 If a negative @var{pid} argument is desired as the first one, either a
11359 signal must be specified as well, or the option parsing
11360 must be interrupted with @samp{--} before the first @var{pid} argument.
11361 The following three commands are equivalent:
11369 The first form of the @command{kill} command succeeds if every @var{pid}
11370 argument specifies at least one process that the signal was sent to.
11372 The second form of the @command{kill} command lists signal information.
11373 Either the @option{-l} or @option{--list} option, or the @option{-t}
11374 or @option{--table} option must be specified. Without any
11375 @var{signal} argument, all supported signals are listed. The output
11376 of @option{-l} or @option{--list} is a list of the signal names, one
11377 per line; if @var{signal} is already a name, the signal number is
11378 printed instead. The output of @option{-t} or @option{--table} is a
11379 table of signal numbers, names, and descriptions. This form of the
11380 @command{kill} command succeeds if all @var{signal} arguments are valid
11381 and if there is no output error.
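The listing form can also translate between representations of a signal. The sketch below assumes a POSIX shell whose built-in @command{kill} behaves like the Coreutils program for these two cases, and that @samp{TERM} is signal 15; the variable names are illustrative.

```shell
# Translate a signal number to its name.
name_from_number=$(kill -l 15)

# An exit status of 128 + N, as reported for a process killed by signal N,
# is accepted as well.
name_from_status=$(kill -l 143)

echo "$name_from_number $name_from_status"
```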
11383 The @command{kill} command also supports the @option{--help} and
11384 @option{--version} options. @xref{Common options}.
11386 A @var{signal} may be a signal name like @samp{HUP}, or a signal
11387 number like @samp{1}, or an exit status of a process terminated by the
11388 signal. A signal name can be given in canonical form or prefixed by
11389 @samp{SIG}. The case of the letters is ignored, except for the
11390 @option{-@var{signal}} option which must use upper case to avoid
11391 ambiguity with lower case option letters. The following signal names
11392 and numbers are supported on all @acronym{POSIX} compliant systems:
11398 2. Terminal interrupt.
11404 9. Kill (cannot be caught or ignored).
11412 Other supported signal names have system-dependent corresponding
11413 numbers. All systems conforming to @acronym{POSIX} 1003.1-2001 also
11414 support the following signals:
11418 Access to an undefined portion of a memory object.
11420 Child process terminated, stopped, or continued.
11422 Continue executing, if stopped.
11424 Erroneous arithmetic operation.
11426 Illegal Instruction.
11428 Write on a pipe with no one to read it.
11430 Invalid memory reference.
11432 Stop executing (cannot be caught or ignored).
11436 Background process attempting read.
11438 Background process attempting write.
11440 High bandwidth data is available at a socket.
11442 User-defined signal 1.
11444 User-defined signal 2.
11448 @acronym{POSIX} 1003.1-2001 systems that support the @acronym{XSI} extension
11449 also support the following signals:
11455 Profiling timer expired.
11459 Trace/breakpoint trap.
11461 Virtual timer expired.
11463 CPU time limit exceeded.
11465 File size limit exceeded.
11469 @acronym{POSIX} 1003.1-2001 systems that support the @acronym{XRT} extension
11470 also support at least eight real-time signals called @samp{RTMIN},
11471 @samp{RTMIN+1}, @dots{}, @samp{RTMAX-1}, @samp{RTMAX}.
11477 @cindex delaying commands
11478 @cindex commands for delaying
11480 @c Perhaps @code{wait} or other commands should be described here also?
11483 * sleep invocation:: Delay for a specified time.
11487 @node sleep invocation
11488 @section @command{sleep}: Delay for a specified time
11491 @cindex delay for a specified time
11493 @command{sleep} pauses for an amount of time specified by the sum of
11494 the values of the command line arguments.
11498 sleep @var{number}[smhd]@dots{}
11502 Each argument is a number followed by an optional unit; the default
11503 is seconds. The units are:
11516 Historical implementations of @command{sleep} have required that
11517 @var{number} be an integer. However, GNU @command{sleep} accepts
11518 arbitrary floating point numbers.
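A small sketch of both features follows: two arguments are summed, one of them fractional and one carrying an explicit unit. It assumes GNU @command{sleep}; other implementations may reject fractions or multiple arguments. The timing check is deliberately coarse because @command{date} is only asked for whole seconds.

```shell
# Two arguments are summed: 0.2 seconds plus 0.3 seconds (explicit "s" unit).
start=$(date +%s)
sleep 0.2 0.3s
sleep_status=$?
end=$(date +%s)

elapsed=$((end - start))
echo "status=$sleep_status elapsed=${elapsed}s"
```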
11520 The only options are @option{--help} and @option{--version}. @xref{Common options}.
11524 @node Numeric operations
11525 @chapter Numeric operations
11527 @cindex numeric operations
11528 These programs do numerically-related operations.
11531 * factor invocation:: Show factors of numbers.
11532 * seq invocation:: Print sequences of numbers.
11536 @node factor invocation
11537 @section @command{factor}: Print prime factors
11540 @cindex prime factors
11542 @command{factor} prints prime factors. Synopses:
11545 factor [@var{number}]@dots{}
11546 factor @var{option}
11549 If no @var{number} is specified on the command line, @command{factor} reads
11550 numbers from standard input, delimited by newlines, tabs, or spaces.
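The two ways of supplying numbers can be sketched as follows; the inputs are arbitrary small values chosen so that the factorizations are easy to check by hand.

```shell
# Numbers given as command line arguments.
from_args=$(factor 60 97)

# Numbers read from standard input, separated by spaces or newlines.
from_stdin=$(printf '12 15\n360\n' | factor)

printf '%s\n' "$from_args" "$from_stdin"
```

Each output line repeats the number, followed by a colon and its prime factors in ascending order; a prime such as 97 is listed as its own only factor.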
11552 The only options are @option{--help} and @option{--version}. @xref{Common options}.
11555 The algorithm it uses is not very sophisticated, so for some inputs
11556 @command{factor} runs for a long time. The hardest numbers to factor are
11557 the products of large primes. Factoring the product of the two largest 32-bit
11558 prime numbers takes over 10 minutes of CPU time on a 400MHz Pentium II.
11561 $ p=`echo '4294967279 * 4294967291'|bc`
11563 18446743979220271189: 4294967279 4294967291
11566 In contrast, @command{factor} factors the largest 64-bit number in just
11567 over a tenth of a second:
11570 $ factor `echo '2^64-1'|bc`
11571 18446744073709551615: 3 5 17 257 641 65537 6700417
11574 @node seq invocation
11575 @section @command{seq}: Print numeric sequences
11578 @cindex numeric sequences
11579 @cindex sequence of numbers
11581 @command{seq} prints a sequence of numbers to standard output. Synopses:
11584 seq [@var{option}]@dots{} [@var{first} [@var{increment}]] @var{last}@dots{}
11587 @command{seq} prints the numbers from @var{first} to @var{last} by
11588 @var{increment}. By default, @var{first} and @var{increment} are both 1,
11589 and each number is printed on its own line. All numbers can be reals, not just integers.
11592 The program accepts the following options. Also see @ref{Common options}.
11595 @item -f @var{format}
11596 @itemx --format=@var{format}
11597 @opindex -f @var{format}
11598 @opindex --format=@var{format}
11599 @cindex formatting of numbers in @command{seq}
11600 Print all numbers using @var{format}; default @samp{%g}.
11601 @var{format} must contain exactly one of the floating point
11602 output formats @samp{%e}, @samp{%f}, or @samp{%g}.
11604 @item -s @var{string}
11605 @itemx --separator=@var{string}
11606 @cindex separator for numbers in @command{seq}
11607 Separate numbers with @var{string}; default is a newline.
11608 The output always terminates with a newline.
11611 @itemx --equal-width
11612 Print all numbers with the same width, by padding with leading zeroes.
11613 (To have other kinds of padding, use @option{--format}.)
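The options above can be tried on short sequences, as in this sketch. The values are arbitrary, and @env{LC_ALL} is set to @samp{C} for the formatted case only so that the decimal point does not depend on the locale.

```shell
# Defaults: first and increment are both 1, one number per line.
by_default=$(seq 3)

# -s replaces the newline between numbers; the output still ends with a newline.
separated=$(seq -s, 1 5)

# -w pads with leading zeroes to a common width.
padded=$(seq -w 8 10)

# -f takes a printf-style floating point format.
formatted=$(LC_ALL=C seq -f '%.2f' 1 0.5 2)

printf '%s\n' "$by_default" "$separated" "$padded" "$formatted"
```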
11617 If you want to use @command{seq} to print sequences of large integer values,
11618 don't use the default @samp{%g} format since it can result in loss of precision:
11622 $ seq 1000000 1000001
11627 Instead, you can use the format, @samp{%1.f},
11628 to print large decimal numbers with no exponent and no decimal point.
11631 $ seq --format=%1.f 1000000 1000001
11636 If you want hexadecimal output, you can use @command{printf}
11637 to perform the conversion:
11640 $ printf %x'\n' `seq -f %1.f 1048575 1024 1050623`
11646 For very long lists of numbers, use @command{xargs} to avoid
11647 system limitations on the length of an argument list:
11650 $ seq -f %1.f 1000000 | xargs printf %x'\n' | tail -n 3
11656 To generate octal output, use the @command{printf} @code{%o} format instead
11657 of @code{%x}. Note however that using @command{printf} works only for numbers
11658 smaller than @code{2^32}:
11661 $ printf "%x\n" `seq -f %1.f 4294967295 4294967296`
11663 bash: printf: 4294967296: Numerical result out of range
11666 On most systems, @command{seq} can produce whole-number output for values up to
11667 @code{2^53}, so here's a more general approach to base conversion that
11668 also happens to be more robust for such large numbers. It works by
11669 using @command{bc} and setting its output radix variable, @var{obase},
11670 to @samp{16} in this case to produce hexadecimal output.
11673 $ (echo obase=16; seq -f %1.f 4294967295 4294967296)|bc
11678 Be careful when using @command{seq} with a fractional @var{increment},
11679 otherwise you may see surprising results. Most people would expect to
11680 see @code{0.3} printed as the last number in this example:
11683 $ seq -s' ' 0 .1 .3
11687 But that doesn't happen on most systems because @command{seq} is
11688 implemented using binary floating point arithmetic (via the C
11689 @code{double} type) -- which means some decimal numbers like @code{.1}
11690 cannot be represented exactly. That in turn means some nonintuitive
11691 conditions like @code{.1 * 3 > .3} will end up being true.
11693 To work around that in the above example, use a slightly larger number as
11694 the @var{last} value:
11697 $ seq -s' ' 0 .1 .31
11701 In general, when using an @var{increment} with a fractional part, where
11702 (@var{last} - @var{first}) / @var{increment} is (mathematically) a whole
11703 number, specify a slightly larger (or smaller, if @var{increment} is negative)
11704 value for @var{last} to ensure that @var{last} is the final value printed by @command{seq}.
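The comparison mentioned above can be checked with @command{awk}, which also computes in C @code{double} values. The second command sketches an alternative that is not taken from this manual: count in integers, where the arithmetic is exact, and scale only when printing. Both assume a system with IEEE 754 double precision arithmetic.

```shell
# Prints 1 if .1 * 3 compares greater than .3 in binary floating point.
cmp=$(awk 'BEGIN { print (0.1 * 3 > 0.3) }')

# Alternative: let seq count whole tenths, then divide when formatting.
scaled=$(seq 0 3 | LC_ALL=C awk '{ printf "%.1f\n", $1 / 10 }')

echo "comparison: $cmp"
printf '%s\n' "$scaled"
```

With the integer approach the last value is always printed, because the loop bound is compared as a whole number.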
11707 @node File permissions
11708 @chapter File permissions
11711 @include getdate.texi
11715 @node Opening the software toolbox
11716 @chapter Opening the Software Toolbox
11718 This chapter originally appeared in @cite{Linux Journal}, volume 1,
11719 number 2, in the @cite{What's GNU?} column. It was written by Arnold Robbins.
11723 * Toolbox introduction:: Toolbox introduction
11724 * I/O redirection:: I/O redirection
11725 * The who command:: The @command{who} command
11726 * The cut command:: The @command{cut} command
11727 * The sort command:: The @command{sort} command
11728 * The uniq command:: The @command{uniq} command
11729 * Putting the tools together:: Putting the tools together
11733 @node Toolbox introduction
11734 @unnumberedsec Toolbox Introduction
11736 This month's column is only peripherally related to the GNU Project, in
11737 that it describes a number of the GNU tools on your GNU/Linux system and how they
11738 might be used. What it's really about is the ``Software Tools'' philosophy
11739 of program development and usage.
11741 The software tools philosophy was an important and integral concept
11742 in the initial design and development of Unix (of which Linux and GNU are
11743 essentially clones). Unfortunately, in the modern day press of
11744 Internetworking and flashy GUIs, it seems to have fallen by the
11745 wayside. This is a shame, since it provides a powerful mental model
11746 for solving many kinds of problems.
11748 Many people carry a Swiss Army knife around in their pants pockets (or
11749 purse). A Swiss Army knife is a handy tool to have: it has several knife
11750 blades, a screwdriver, tweezers, toothpick, nail file, corkscrew, and perhaps
11751 a number of other things on it. For the everyday, small miscellaneous jobs
11752 where you need a simple, general purpose tool, it's just the thing.
11754 On the other hand, an experienced carpenter doesn't build a house using
11755 a Swiss Army knife. Instead, he has a toolbox chock full of specialized
11756 tools---a saw, a hammer, a screwdriver, a plane, and so on. And he knows
11757 exactly when and where to use each tool; you won't catch him hammering nails
11758 with the handle of his screwdriver.
11760 The Unix developers at Bell Labs were all professional programmers and trained
11761 computer scientists. They had found that while a one-size-fits-all program
11762 might appeal to a user because there's only one program to use, in practice such programs are
11767 difficult to write,
11770 difficult to maintain and
11774 difficult to extend to meet new situations.
11777 Instead, they felt that programs should be specialized tools. In short, each
11778 program ``should do one thing well.'' No more and no less. Such programs are
11779 simpler to design, write, and get right---they only do one thing.
11781 Furthermore, they found that with the right machinery for hooking programs
11782 together, the whole was greater than the sum of the parts. By combining
11783 several special purpose programs, you could accomplish a specific task
11784 that none of the programs was designed for, and accomplish it much more
11785 quickly and easily than if you had to write a special purpose program.
11786 We will see some (classic) examples of this further on in the column.
11787 (An important additional point was that, if necessary, you should take a detour
11788 and build any software tools you may need first, if you don't already
11789 have something appropriate in the toolbox.)
11791 @node I/O redirection
11792 @unnumberedsec I/O Redirection
11794 Hopefully, you are familiar with the basics of I/O redirection in the
11795 shell, in particular the concepts of ``standard input,'' ``standard output,''
11796 and ``standard error''. Briefly, ``standard input'' is a data source, where
11797 data comes from. A program should not need to either know or care if the
11798 data source is a disk file, a keyboard, a magnetic tape, or even a punched
11799 card reader. Similarly, ``standard output'' is a data sink, where data goes
11800 to. The program should neither know nor care where this might be.
11801 Programs that only read their standard input, do something to the data,
11802 and then send it on, are called @dfn{filters}, by analogy to filters in a water pipeline.
11805 With the Unix shell, it's very easy to set up data pipelines:
11808 program_to_create_data | filter1 | .... | filterN > final.pretty.data
11811 We start out by creating the raw data; each filter applies some successive
11812 transformation to the data, until by the time it comes out of the pipeline,
11813 it is in the desired form.
11815 This is fine and good for standard input and standard output. Where does the
11816 standard error come into play? Well, think about @command{filter1} in
11817 the pipeline above. What happens if it encounters an error in the data it
11818 sees? If it writes an error message to standard output, it will just
11819 disappear down the pipeline into @command{filter2}'s input, and the
11820 user will probably never see it. So programs need a place where they can send
11821 error messages so that the user will notice them. This is standard error,
11822 and it is usually connected to your console or window, even if you have
11823 redirected standard output of your program away from your screen.
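The separation of the two output streams can be sketched with a small shell function. Here @code{filter1} is a hypothetical filter written only for this illustration, and standard error is redirected to a temporary file purely so that it can be inspected afterwards; normally it would simply appear on the terminal.

```shell
# filter1 is a hypothetical filter: it passes good lines through on standard
# output and reports bad ones on standard error.
filter1() {
  while IFS= read -r line; do
    case $line in
      bad*) echo "filter1: cannot parse: $line" >&2 ;;
      *)    echo "$line" ;;
    esac
  done
}

errlog=$(mktemp)

# Only standard output flows into the next stage (tr); the error message
# bypasses the pipeline entirely.
piped=$(printf 'good 1\nbad 2\ngood 3\n' | filter1 2>"$errlog" | tr 'a-z' 'A-Z')
errors=$(cat "$errlog")
rm -f "$errlog"

printf 'pipeline output:\n%s\nerror stream:\n%s\n' "$piped" "$errors"
```

The error message is not upper-cased, which shows that it never passed through the second stage.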
11825 For filter programs to work together, the format of the data has to be
11826 agreed upon. The most straightforward and easiest format to use is simply
11827 lines of text. Unix data files are generally just streams of bytes, with
11828 lines delimited by the @acronym{ASCII} @sc{lf} (Line Feed) character,
11829 conventionally called a ``newline'' in the Unix literature. (This is
11830 @code{'\n'} if you're a C programmer.) This is the format used by all
11831 the traditional filtering programs. (Many earlier operating systems
11832 had elaborate facilities and special purpose programs for managing
11833 binary data. Unix has always shied away from such things, under the
11834 philosophy that it's easiest to simply be able to view and edit your
11835 data with a text editor.)
11837 OK, enough introduction. Let's take a look at some of the tools, and then
11838 we'll see how to hook them together in interesting ways. In the following
11839 discussion, we will only present those command line options that interest
11840 us. As you should always do, double check your system documentation
11841 for the full story.
11843 @node The who command
11844 @unnumberedsec The @command{who} Command
11846 The first program is the @command{who} command. By itself, it generates a
11847 list of the users who are currently logged in. Although I'm writing
11848 this on a single-user system, we'll pretend that several people are logged in:
11853 @print{} arnold console Jan 22 19:57
11854 @print{} miriam ttyp0 Jan 23 14:19(:0.0)
11855 @print{} bill ttyp1 Jan 21 09:32(:0.0)
11856 @print{} arnold ttyp2 Jan 23 20:48(:0.0)
11859 Here, the @samp{$} is the usual shell prompt, at which I typed @samp{who}.
11860 There are three people logged in, and I am logged in twice. On traditional
11861 Unix systems, user names are never more than eight characters long. This
11862 little bit of trivia will be useful later. The output of @command{who} is nice,
11863 but the data is not all that exciting.
11865 @node The cut command
11866 @unnumberedsec The @command{cut} Command
11868 The next program we'll look at is the @command{cut} command. This program
11869 cuts out columns or fields of input data. For example, we can tell it
11870 to print just the login name and full name from the @file{/etc/passwd}
11871 file. The @file{/etc/passwd} file has seven fields, separated by colons:
11875 arnold:xyzzy:2076:10:Arnold D. Robbins:/home/arnold:/bin/bash
11878 To get the first and fifth fields, we would use @command{cut} like this:
11881 $ cut -d: -f1,5 /etc/passwd
11882 @print{} root:Operator
11884 @print{} arnold:Arnold D. Robbins
11885 @print{} miriam:Miriam A. Robbins
11889 With the @option{-c} option, @command{cut} will cut out specific characters
11890 (i.e., columns) in the input lines. This is useful for input data
11891 that has fixed width fields, and does not have a field separator. For
11892 example, list the Monday dates for the current month:
11894 @c Is using cal ok? Looked at gcal, but I don't like it.
11905 @command{cut} can also add field separators to fixed width data, using the
11906 @option{--output-delimiter} option. This can be very useful to fill a database.
11909 @c [Why] can't that silly total line for directories be switched off?
11911 $ ls -ld ~/* | cut --output-delimiter=, -c1,2-4,5-7,8-10,57- | tee home.cs
11912 @print{} d,rwx,r-x,r-x,CVS
11913 @print{} d,rwx,---,---,Mail
11914 @print{} d,rwx,r-x,r-x,lilypond
11915 @print{} d,rwx,r-x,r-x,savannah
11916 $ mysql -e 'create table home \
11917 (d char(1),u char(3), g char (3), o char (3), name text)' test
11918 $ mysqlimport --fields-terminated-by=, test home.cs
11919 @print{} test.home: Records: 4 Deleted: 0 Skipped: 0 Warnings: 0
11920 $ mysql -e 'select * from home' test
11921 @print{} +------+------+------+------+----------+
11922 @print{} | d | u | g | o | name |
11923 @print{} +------+------+------+------+----------+
11924 @print{} | d | rwx | r-x | r-x | CVS |
11925 @print{} | d | rwx | --- | --- | Mail |
11926 @print{} | d | rwx | r-x | r-x | lilypond |
11927 @print{} | d | rwx | r-x | r-x | savannah |
11928 @print{} +------+------+------+------+----------+
11931 But beware of assumptions.
11932 The above invocation of @command{ls} assumes that the owner
11933 and group names are no longer than eight bytes each,
11934 and that no file has size larger than 99999999 bytes.
11935 Otherwise, the byte offset of @samp{57} would need to be larger.
11936 To avoid such problems, suppress output of the owner and group
11937 names with the @option{-g} and @option{-G} options respectively,
11938 and add the @option{-h} option to ensure that the representation
11939 of the size of the file does not exceed the allotted space.
11940 Finally, note that the width of even the date/time field may change,
11941 depending on the current locale. To avoid that, use an option
11942 like @option{--time-style='+%Y-%m-%d %H:%M:%S'}.
11944 And there's still another problem: if a file has more
11945 than 999 hard links to it, then that will change the alignment.
11946 The moral is that it is hard to use fixed byte offsets into
11947 a line of @command{ls} output. Use a different tool, like
11948 @command{find}, but with @option{-printf} and carefully chosen format strings.
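One way to follow that advice is sketched below. It assumes GNU @command{find}, whose @option{-printf} directives @samp{%M} (symbolic mode) and @samp{%f} (base name) are used here, and GNU @command{cut} for @option{--output-delimiter}. The directory and file names are hypothetical and are created in a temporary directory.

```shell
dir=$(mktemp -d)
mkdir "$dir/CVS" && chmod 755 "$dir/CVS"
: > "$dir/notes.txt" && chmod 640 "$dir/notes.txt"

# %M is the ten-character symbolic mode and %f the base name, so the name
# starts at column 11 whatever the owner, size, link count, or locale.
listing=$(find "$dir" -mindepth 1 -maxdepth 1 -printf '%M%f\n' |
  LC_ALL=C sort |
  cut --output-delimiter=, -c1,2-4,5-7,8-10,11-)

rm -rf "$dir"
printf '%s\n' "$listing"
```

Because the format string contains only the fields that are wanted, none of the width assumptions discussed above apply. File names containing newlines would still need separate handling.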
11950 @node The sort command
11951 @unnumberedsec The @command{sort} Command
11953 Next we'll look at the @command{sort} command. This is one of the most
11954 powerful commands on a Unix-style system; one that you will often find
11955 yourself using when setting up fancy data plumbing.
11958 The @command{sort} command reads and sorts each file named on the command line. It then
11959 merges the sorted data and writes it to standard output. It will read
11960 standard input if no files are given on the command line (thus
11961 making it into a filter). The sort is based on the character collating
11962 sequence or based on user-supplied ordering criteria.
11965 @node The uniq command
11966 @unnumberedsec The @command{uniq} Command
11968 Finally (at least for now), we'll look at the @command{uniq} program. When
11969 sorting data, you will often end up with duplicate lines, lines that
11970 are identical. Usually, all you need is one instance of each line.
11971 This is where @command{uniq} comes in. The @command{uniq} program reads its
11972 standard input, which it expects to be sorted. It only prints out one
11973 copy of each duplicated line. It does have several options. Later on,
11974 we'll use the @option{-c} option, which prints each unique line, preceded
11975 by a count of the number of times that line occurred in the input.
11978 @node Putting the tools together
11979 @unnumberedsec Putting the Tools Together
11981 Now, let's suppose this is a large ISP server system with dozens of users
11982 logged in. The management wants the system administrator to write a program that will
11983 generate a sorted list of logged in users. Furthermore, even if a user
11984 is logged in multiple times, his or her name should only show up in the output once.
11987 The administrator could sit down with the system documentation and write a C
11988 program that did this. It would take perhaps a couple of hundred lines
11989 of code and about two hours to write it, test it, and debug it.
11990 However, knowing the software toolbox, the administrator can instead start out
11991 by generating just a list of logged on users:
12001 Next, sort the list:
12004 $ who | cut -c1-8 | sort
12011 Finally, run the sorted list through @command{uniq}, to weed out duplicates:
12014 $ who | cut -c1-8 | sort | uniq
12020 The @command{sort} command actually has a @option{-u} option that does what
12021 @command{uniq} does. However, @command{uniq} has other uses for which one
12022 cannot substitute @samp{sort -u}.
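The equivalence, and one of the uses that @samp{sort -u} cannot replace, can be sketched as follows. The function @code{who_sample} is a stand-in for @command{who}, introduced here only so that the result is reproducible; its data mirrors the sample session shown earlier.

```shell
# who_sample stands in for who(1) so that the result is reproducible.
who_sample() {
  cat <<'EOF'
arnold   console Jan 22 19:57
miriam   ttyp0   Jan 23 14:19
bill     ttyp1   Jan 21 09:32
arnold   ttyp2   Jan 23 20:48
EOF
}

# For plain de-duplication the two pipelines give the same list.
with_uniq=$(who_sample | cut -c1-8 | sort | uniq)
with_sort_u=$(who_sample | cut -c1-8 | sort -u)

# Something sort -u cannot do: count the sessions per user.
sessions=$(who_sample | cut -c1-8 | sort | uniq -c | awk '{ print $2 "=" $1 }')

printf '%s\n' "$sessions"
```

The final @command{awk} stage only normalizes the spacing of the @samp{uniq -c} output for display.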
12024 The administrator puts this pipeline into a shell script, and makes it available for
12025 all the users on the system (@samp{#} is the system administrator,
12026 or @code{root}, prompt):
12029 # cat > /usr/local/bin/listusers
12030 who | cut -c1-8 | sort | uniq
12032 # chmod +x /usr/local/bin/listusers
12035 There are four major points to note here. First, with just four
12036 programs, on one command line, the administrator was able to save about two
12037 hours worth of work. Furthermore, the shell pipeline is just about as
12038 efficient as the C program would be, and it is much more efficient in
12039 terms of programmer time. People time is much more expensive than
12040 computer time, and in our modern ``there's never enough time to do
12041 everything'' society, saving two hours of programmer time is no mean feat.
12044 Second, it is also important to emphasize that with the
12045 @emph{combination} of the tools, it is possible to do a special
12046 purpose job never imagined by the authors of the individual programs.
12048 Third, it is also valuable to build up your pipeline in stages, as we did here.
12049 This allows you to view the data at each stage in the pipeline, which helps
12050 you acquire the confidence that you are indeed using these tools correctly.
12052 Finally, by bundling the pipeline in a shell script, other users can use
12053 your command, without having to remember the fancy plumbing you set up for
12054 them. In terms of how you run them, shell scripts and compiled programs are indistinguishable.
12057 After the previous warm-up exercise, we'll look at two additional, more
12058 complicated pipelines. For them, we need to introduce two more tools.
12060 The first is the @command{tr} command, which stands for ``transliterate.''
12061 The @command{tr} command works on a character-by-character basis, changing
12062 characters. Normally it is used for things like mapping upper case to lower case:
12066 $ echo ThIs ExAmPlE HaS MIXED case! | tr '[A-Z]' '[a-z]'
12067 @print{} this example has mixed case!
12070 There are several options of interest:
12074 work on the complement of the listed characters, i.e.,
12075 operations apply to characters not in the given set
12078 delete characters in the first set from the output
12081 squeeze repeated characters in the output into just one character.
12084 We will be using all three options in a moment.
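Each behavior can be tried in isolation first, as in this sketch. The input strings are arbitrary, and the sets are written without surrounding brackets, which @command{tr} would otherwise treat as ordinary characters.

```shell
# Plain translation: map upper case to lower case.
lowered=$(echo 'ThIs ExAmPlE' | tr 'A-Z' 'a-z')

# Complement plus delete: remove everything that is NOT a letter, blank,
# or newline.
letters_only=$(echo 'Hello, World!' | tr -cd 'A-Za-z \n')

# Squeeze: reduce each run of blanks to a single blank.
squeezed=$(echo 'too    many   blanks' | tr -s ' ')

printf '%s\n' "$lowered" "$letters_only" "$squeezed"
```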
12086 The other command we'll look at is @command{comm}. The @command{comm}
12087 command takes two sorted input files as input data, and prints out the
12088 files' lines in three columns. The output columns are the data lines
12089 unique to the first file, the data lines unique to the second file, and
12090 the data lines that are common to both. The @option{-1}, @option{-2}, and
12091 @option{-3} command line options @emph{omit} the respective columns. (This is
12092 non-intuitive and takes a little getting used to.) For example:
12114 The single dash as a filename tells @command{comm} to read standard input
12115 instead of a regular file.
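A small sketch of the column options, using two hypothetical sorted files created in a temporary directory:

```shell
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/first"
printf 'banana\ncherry\ndate\n'  > "$dir/second"

# Omit columns 2 and 3: lines found only in the first file remain.
only_first=$(comm -23 "$dir/first" "$dir/second")

# Omit columns 1 and 2: lines common to both files remain.
common=$(comm -12 "$dir/first" "$dir/second")

# A single dash makes comm read the first file from standard input.
only_second=$(comm -13 - "$dir/second" < "$dir/first")

rm -rf "$dir"
printf 'first only: %s\ncommon: %s\nsecond only: %s\n' \
  "$only_first" "$(echo $common)" "$only_second"
```

When a single column is left, @command{comm} prints it without leading tabs, which makes the output convenient for further processing.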
12117 Now we're ready to build a fancy pipeline. The first application is a word
12118 frequency counter. This helps an author determine if he or she is over-using certain words.
12121 The first step is to change the case of all the letters in our input file
12122 to one case. ``The'' and ``the'' are the same word when doing counting.
12125 $ tr '[A-Z]' '[a-z]' < whats.gnu | ...
12128 The next step is to get rid of punctuation. Quoted words and unquoted words
12129 should be treated identically; it's easiest to just get the punctuation out of the way.
12133 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' | ...
12136 The second @command{tr} command operates on the complement of the listed
12137 characters, which are all the letters, the digits, the underscore, and
12138 the blank. The @samp{\012} represents the newline character; it has to
12139 be left alone. (The @acronym{ASCII} tab character should also be included for
12140 good measure in a production script.)
12142 At this point, we have data consisting of words separated by blank space.
12143 The words only contain alphanumeric characters (and the underscore). The
12144 next step is to break the data apart so that we have one word per line. This
12145 makes the counting operation much easier, as we will see shortly.
12148 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' |
12149 > tr -s '[ ]' '\012' | ...
12152 This command turns blanks into newlines. The @option{-s} option squeezes
12153 multiple newline characters in the output into just one. This helps us
12154 avoid blank lines. (The @samp{>} is the shell's ``secondary prompt.''
12155 This is what the shell prints when it notices you haven't finished
12156 typing in all of a command.)
12158 We now have data consisting of one word per line, no punctuation, all one
12159 case. We're ready to count each word:
12162 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' |
12163 > tr -s '[ ]' '\012' | sort | uniq -c | ...
12166 At this point, the data might look something like this:
12179 The output is sorted by word, not by count! What we want is the most
12180 frequently used words first. Fortunately, this is easy to accomplish,
12181 with the help of two more @command{sort} options:
12185 do a numeric sort, not a textual one
12188 reverse the order of the sort
12191 The final pipeline looks like this:
12194 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' |
12195 > tr -s '[ ]' '\012' | sort | uniq -c | sort -nr
12204 Whew! That's a lot to digest. Yet, the same principles apply. With six
12205 commands, on two lines (really one long one split for convenience), we've
12206 created a program that does something interesting and useful, in much
12207 less time than we could have written a C program to do the same thing.
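To experiment without the @file{whats.gnu} file, the same six stages can be run on a short string, as in this sketch. The sample sentence is made up, and the bracket characters are left out of the @command{tr} sets because @command{tr} treats them as literal characters; that is a small variation on the pipeline above, not a correction of it.

```shell
text='The cat and the hat. The end!'

# Lower-case, strip punctuation, put one word per line, count the words,
# and list the most frequent first.
counts=$(printf '%s\n' "$text" |
  tr 'A-Z' 'a-z' | tr -cd 'a-z0-9_ \n' |
  tr -s ' ' '\n' | sort | uniq -c | sort -nr)

# Normalize the spacing of the first line for display.
top=$(printf '%s\n' "$counts" | head -n 1 | awk '{ print $1, $2 }')
echo "$top"
```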
12209 A minor modification to the above pipeline can give us a simple spelling
12210 checker! To determine if you've spelled a word correctly, all you have to
12211 do is look it up in a dictionary. If it is not there, then chances are
12212 that your spelling is incorrect. So, we need a dictionary.
12213 The conventional location for a dictionary is @file{/usr/dict/words}.
12214 On my GNU/Linux system,@footnote{Red Hat Linux 6.1, for the November 2000
12215 revision of this article.}
12216 this is a sorted, 45,402 word dictionary.
12218 Now, how to compare our file with the dictionary? As before, we generate
12219 a sorted list of words, one per line:
12222 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' |
12223 > tr -s '[ ]' '\012' | sort -u | ...
12226 Now, all we need is a list of words that are @emph{not} in the
12227 dictionary. Here is where the @command{comm} command comes in.
12230 $ tr '[A-Z]' '[a-z]' < whats.gnu | tr -cd '[A-Za-z0-9_ \012]' |
12231 > tr -s '[ ]' '\012' | sort -u |
12232 > comm -23 - /usr/dict/words
12235 The @option{-2} and @option{-3} options eliminate lines that are only in the
12236 dictionary (the second file), and lines that are in both files. Lines
only in the first file (standard input, our stream of words) are
12238 words that are not in the dictionary. These are likely candidates for
12239 spelling errors. This pipeline was the first cut at a production
12240 spelling checker on Unix.
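
Here is a minimal sketch of the same idea on a tiny scale. The
four-word ``dictionary,'' the temporary file, and the deliberately
misspelled @samp{sofware} are all hypothetical. Note that
@command{comm} expects both inputs to be sorted in the same order, which
is why the sketch fixes the locale before sorting anything.

```shell
# Hypothetical stand-ins for the document's words and the dictionary.
# Both inputs must be sorted the same way; LC_ALL=C keeps that predictable.
export LC_ALL=C
dict=$(mktemp)
printf '%s\n' free gnu software unix | sort > "$dict"

# "sofware" is a deliberate misspelling; "gnu" appears twice on purpose.
suspects=$(printf '%s\n' gnu sofware free gnu | sort -u | comm -23 - "$dict")

rm -f "$dict"
printf '%s\n' "$suspects"
```

Only @samp{sofware} survives: @samp{free} and @samp{gnu} are in both
lists, while @samp{software} and @samp{unix} are only in the dictionary.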
12242 There are some other tools that deserve brief mention.
@table @command
@item grep
search files for text that matches a regular expression

@item wc
count lines, words, characters

@item tee
a T-fitting for data pipes, copies data to files and to standard output

@item sed
the stream editor, an advanced tool

@item awk
a data manipulation language, another advanced tool
@end table
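
To give a taste of these tools, here is a short sketch that uses each of
them once. The sample text, the temporary file, and the variable names
are hypothetical; this is one small illustration, not a tour of what the
tools can do.

```shell
# Hypothetical sample text; none of it comes from the article.
text=$(printf '%s\n' 'GNU is not Unix' 'tools do one thing well' 'GNU tools compose')
copy=$(mktemp)

# grep selects matching lines, tee saves a copy while passing them on,
# and wc -l counts what comes out the other end.
matches=$(printf '%s\n' "$text" | grep 'GNU' | tee "$copy" | wc -l | tr -d ' ')

# sed edits the stream: replace the first "GNU" on each line.
edited=$(sed 's/GNU/gnu/' "$copy")

# awk splits each line into fields: keep lines with more than three words.
long_lines=$(awk 'NF > 3' "$copy")

rm -f "$copy"
printf '%s\n' "$matches" "$edited" "$long_lines"
```

The @command{tee} in the middle is the ``T-fitting'': the matching lines
land in the file and still flow on to @command{wc}.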
12261 The software tools philosophy also espoused the following bit of
12262 advice: ``Let someone else do the hard part.'' This means, take
12263 something that gives you most of what you need, and then massage it the
12264 rest of the way until it's in the form that you want.
@enumerate
@item
Each program should do one thing well. No more, no less.

@item
Combining programs with appropriate plumbing leads to results where
the whole is greater than the sum of the parts. It also leads to novel
uses of programs that the authors might never have imagined.

@item
Programs should never print extraneous header or trailer data, since these
could get sent on down a pipeline. (A point we didn't mention earlier.)

@item
Let someone else do the hard part.

@item
Know your toolbox! Use each program appropriately. If you don't have an
appropriate tool, build one.
@end enumerate
12289 As of this writing, all the programs we've discussed are available via
12290 anonymous @command{ftp} from: @*
12291 @uref{ftp://gnudist.gnu.org/textutils/textutils-1.22.tar.gz}. (There may
12292 be more recent versions available now.)
12294 None of what I have presented in this column is new. The Software Tools
12295 philosophy was first introduced in the book @cite{Software Tools}, by
12296 Brian Kernighan and P.J. Plauger (Addison-Wesley, ISBN 0-201-03669-X).
12297 This book showed how to write and use software tools. It was written in
12298 1976, using a preprocessor for FORTRAN named @command{ratfor} (RATional
12299 FORtran). At the time, C was not as ubiquitous as it is now; FORTRAN
12300 was. The last chapter presented a @command{ratfor} to FORTRAN
12301 processor, written in @command{ratfor}. @command{ratfor} looks an awful
lot like C; if you know C, you won't have any problem following the code.
12305 In 1981, the book was updated and made available as @cite{Software Tools
12306 in Pascal} (Addison-Wesley, ISBN 0-201-10342-7). The first book is
12307 still in print; the second, alas, is not. Both books are well worth
12308 reading if you're a programmer. They certainly made a major change in
12309 how I view programming.
12311 Initially, the programs in both books were available (on 9-track tape)
12312 from Addison-Wesley. Unfortunately, this is no longer the case,
12313 although the @command{ratfor} versions are available from
@uref{http://cm.bell-labs.com/who/bwk, Brian Kernighan's home page},
12315 and you might be able to find copies of the Pascal versions floating
12316 around the Internet. For a number of years, there was an active
12317 Software Tools Users Group, whose members had ported the original
12318 @command{ratfor} programs to essentially every computer system with a
12319 FORTRAN compiler. The popularity of the group waned in the middle 1980s
12320 as Unix began to spread beyond universities.
12322 With the current proliferation of GNU code and other clones of Unix programs,
12323 these programs now receive little attention; modern C versions are
12324 much more efficient and do more than these programs do. Nevertheless, as
12325 exposition of good programming style, and evangelism for a still-valuable
12326 philosophy, these books are unparalleled, and I recommend them highly.
12328 Acknowledgment: I would like to express my gratitude to Brian Kernighan
12329 of Bell Labs, the original Software Toolsmith, for reviewing this column.
12331 @include doclicense.texi
12342 @c Local variables:
12343 @c texinfo-column-for-description: 32