.\" $NetBSD: sort.1,v 1.28 2009/08/22 21:55:08 dsl Exp $
.\" Copyright (c) 2000-2003 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Ben Harris and Jaromir Dolecek.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" @(#)sort.1 8.1 (Berkeley) 6/6/93
.Nd sort or merge text files
.Ar field1 Ns Op Li \&, Ns Ar field2
The
.Nm
utility sorts text files by lines.
Comparisons are based on one or more sort keys extracted
from each line of input, and are performed lexicographically.
By default, if keys are not given,
.Nm
regards each input line as a single field.
The following options are available:
Check that the single input file is sorted.
If the file is not sorted,
.Nm
produces the appropriate error messages and exits with code 1; otherwise,
Ignored for compatibility with earlier versions of
.Nm .
Merge only; the input files are assumed to be pre-sorted.
The argument given is the name of an
file to be used instead of the standard output.
This file can be the same as one of the input files.
Do not use stable sort.
The default is to use stable sort.
Use stable sort, which keeps records with equal keys in their original order.
Provided for compatibility with other
.Nm
implementations only.
as the directory for temporary files.
The default is the value specified in the environment variable
Unique: suppress all but one in each set of lines having equal keys.
option, check that there are no lines with duplicate keys.
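As a brief illustration of the checking and unique behaviour described above, the following shell sketch assumes the POSIX option letters -c (check) and -u (unique) and a POSIX shell. The sample data is hypothetical and the variable names exist only for this example.

```shell
# Hypothetical sample data; LC_ALL=C keeps comparisons byte-wise.
export LC_ALL=C

# -c only checks the input: the exit status is 0 when it is sorted.
sorted_status=0
sort -c <<'EOF' || sorted_status=$?
apple
banana
cherry
EOF

# On disorder, -c reports the problem on stderr and exits with 1.
unsorted_status=0
sort -c 2>/dev/null <<'EOF' || unsorted_status=$?
banana
apple
EOF

# -u with a key keeps one line for each distinct first field.
unique_count=$(sort -u -k 1,1 <<'EOF' | wc -l
a 1
a 2
b 3
EOF
)

echo "sorted=$sorted_status unsorted=$unsorted_status unique=$((unique_count))"
```

Which of the lines with equal keys survives under -u is not assumed here, so the sketch only counts the lines that remain.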
The following options override the default ordering rules.
When ordering options appear independent of key field
specifications, the requested field ordering rules are
applied globally to all sort keys.
When attached to a specific key (see
the ordering options override
all global ordering options for that key.
Only blank space and alphanumeric characters
.\" to the current setting of LC_CTYPE
are used
in making comparisons.
Considers all lowercase characters that have uppercase
equivalents to be the same for purposes of comparison.
Ignore all non-printable characters.
An initial numeric string, consisting of optional blank space, optional
minus sign, and zero or more digits (including decimal point)
.\" optional radix character and thousands
.\" (as defined in the current locale),
is sorted by arithmetic value.
option no longer implies the
Reverse the sense of comparisons.
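The difference between global ordering options and options attached to a key can be sketched as follows. The example assumes the POSIX option letters -n, -r and -k and uses hypothetical data; the variable names are illustrative only.

```shell
# Hypothetical sample data; LC_ALL=C keeps the default order byte-wise.
export LC_ALL=C
data='10
9
-3'

# Default ordering compares the lines lexicographically.
lexical=$(echo "$data" | sort)

# A global -n compares by arithmetic value instead.
numeric=$(echo "$data" | sort -n)

# Global -n and -r together: numeric order, reversed.
reversed=$(echo "$data" | sort -n -r)

# The n attached to the key overrides the global -r for that key,
# so the key is compared numerically in ascending order.
keyed=$(echo "$data" | sort -r -k 1,1n)

echo "$lexical"; echo "$numeric"; echo "$reversed"; echo "$keyed"
```

With this data every key is distinct, so the result does not depend on how lines with equal keys are ordered.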
The treatment of field separators can be altered using these options:
Ignores leading blank space when determining the start
and end of a restricted sort key.
option specified before the first
option applies globally to all
option can be attached independently to each
option has no effect unless key fields are specified.
is used as the field separator character.
is not considered to be part of a field when determining
key offsets (see below).
is significant (for example,
delimits an empty field).
is not specified, the default field separator is a sequence of
blank-space characters, and consecutive blank spaces do
.Em not
delimit an empty field; further, the initial blank space
.Em is
considered part of a field when determining key offsets.
is used as the record separator character.
This should be used with discretion;
.Fl R Ar \*[Lt]alphanumeric\*[Gt]
usually produces undesirable results.
The default record separator is newline.
.It Fl k Ar field1 Ns Op Li \&, Ns Ar field2
Designates the starting position,
and optional ending position,
option replaces the obsolescent options
The following operands are available:
The pathname of a file to be sorted, merged, or checked.
operands are specified, or if
the standard input is used.
A field is defined as a minimal sequence of characters followed by a
field separator or a newline character.
By default, the first
blank space of a sequence of blank spaces acts as the field separator.
All blank spaces in a sequence of blank spaces are considered
as part of the next field; for example, all blank spaces at
the beginning of a line are considered to be part of the
.Ar field1 Ns Op \&, Ns Ar field2
argument defaults to the end of a line.
.Ar m Ns Li \&. Ns Ar n
and can be followed by one or more of the letters
which correspond to the options discussed above.
position specified by
.Ar m Ns Li \&. Ns Ar n
.Pq Ar m , n No \*[Gt] 0
is interpreted as the
indicating the first character of the
is counted from the first non-blank character in the
refers to the first non-blank character in the
position specified by
.Ar m Ns Li \&. Ns Ar n
character (including separators) of the
indicates the last character of the
designates the end of a line.
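The key positions described above can be tried out with a short shell sketch. It assumes the POSIX -t and -k options, with ':' chosen as the field separator, and uses hypothetical records; the variable names are illustrative only.

```shell
# Hypothetical colon-separated records; LC_ALL=C keeps comparisons byte-wise.
export LC_ALL=C
data='carol:staff:3
bob::70
alice:admin:25'

# Key is field 2 only. With -t every ':' is significant, so the
# empty second field of the bob record is a real (empty) key.
by_field2=$(echo "$data" | sort -t ':' -k 2,2)

# Key is field 3 only, compared by arithmetic value (n on the key).
by_field3=$(echo "$data" | sort -t ':' -k 3,3n)

# Key in m.n form: the second character of field 1, and nothing else.
by_char=$(echo "$data" | sort -t ':' -k 1.2,1.2)

echo "$by_field2"; echo "$by_field3"; echo "$by_char"
```

Giving both a start and an end position keeps each key to the part of the line that is meant to be compared; without the end position the key would run to the end of the line.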
.Ar v Li \&. Ar x Li \&,
is synonymous with the obsolescent option
.Cm \(pl Ar v-\&1 Li \&. Ar x-\&1
.Fl Ar w-\&1 Li \&. Ar y ;
.Ar v Li \&. Ar x Li \&, Ar w
.Cm \(pl Ar v-\&1 Li \&. Ar x-\&1
option is still supported, except for
.Fl Ns Ar w Ns Li \&.0b ,
Sort exits with one of the following values:
.Bl -tag -width flag -compact
On disorder (or non-uniqueness) with the
If the following environment variable exists, it is used by
uses the contents of the
environment variable as the path in which to store
.Bl -tag -width outputNUMBER+some -compact
Default temporary files.
.It Ar output Ns NUMBER
Temporary file which is used for output if
Once sorting is finished, this file replaces
implementation appeared in
POSIX requires the locale's thousands separator be ignored in numbers.
It may be faster to sort very large files in pieces and then explicitly
merge them.
.Nm
has no limits on input line length (other than imposed by available
memory) or any restrictions on bytes allowed within lines.
and thus fails on protected directories.
Input files should be text files.
If a file does not end with a record separator (which is typically newline), the
.Nm
utility silently supplies one.
uses lexicographic radix sorting, which requires
that sort keys be kept in memory (as opposed to previous versions, which used quick
and merge sorts and did not).
Thus performance depends highly on efficient choice of sort keys, and the
option should be used whenever possible.
and may take twice as long.
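To illustrate the remark about choosing sort keys, the sketch below assumes the POSIX -f and -k options and compares a global case-folding sort with a folded key that spans the whole line and with one bounded to the first field. The data and variable names are hypothetical, and the sketch shows only that the three forms order this input identically; it does not measure their cost.

```shell
# Hypothetical sample data; LC_ALL=C makes the default order byte-wise.
export LC_ALL=C
data='banana 2
Cherry 3
apple 1'

# Default order: the uppercase C sorts before the lowercase letters.
plain=$(echo "$data" | sort)

# Case folded globally, with no key specification.
global_fold=$(echo "$data" | sort -f)

# Case folded on a key running from field 1 to the end of the line.
keyed_fold=$(echo "$data" | sort -k 1f)

# Case folded on a key bounded to field 1 only.
bounded_fold=$(echo "$data" | sort -k 1,1f)

echo "$plain"; echo "$global_fold"; echo "$keyed_fold"; echo "$bounded_fold"
```

When only the leading field matters, the bounded form is the one that gives sort the least key material to compare.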