.\" $NetBSD: tr.1,v 1.21 2013/08/10 20:59:27 dholland Exp $
.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\" may be used to endorse or promote products derived from this software
.\" without specific prior written permission.
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" @(#)tr.1 8.1 (Berkeley) 6/6/93
.Nd translate characters
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
The following options are available:
Complements the set of characters in
includes every character except for
option causes characters to be deleted from the input.
option squeezes multiple occurrences of the characters listed in the last
in the input into a single instance of the character.
This occurs after all deletion and translation have been completed.
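As a minimal sketch of how these options combine, the following commands
can be tried at a shell prompt.
The sample strings and the variable names are illustrative assumptions
and are not part of the utility.

```shell
# Translate a, b and c to x, then squeeze the resulting run of x's.
# Squeezing happens after translation, so only one x remains.
after_translate=$(printf 'aabbcc\n' | tr -s 'abc' 'xxx')

# Delete every 'a', then squeeze runs of 'c' in what is left.
after_delete=$(printf 'aabbccdd\n' | tr -ds 'a' 'c')

# Complement the set: delete everything except digits and newline.
only_digits=$(printf 'tel: 555-0100\n' | tr -cd '0-9\n')

printf '%s\n' "$after_translate" "$after_delete" "$only_digits"
```

The newline is listed explicitly in the last command because the
complemented set would otherwise include it and the line terminator
would be deleted as well.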
In the first synopsis form, the characters in
are translated into the characters in
where the first character in
is translated into the first character in
the last character found in
In the second synopsis form, the characters in
are deleted from the input.
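The first two synopsis forms can be sketched as follows.
The input word and the variable names are hypothetical and are chosen
only to make the positional mapping visible.

```shell
# First form: positional translation, a->x, b->y, c->z.
# Characters not listed in the first string pass through unchanged.
translated=$(printf 'cabbage\n' | tr 'abc' 'xyz')

# Second form: every a, b and c is removed from the input.
deleted=$(printf 'cabbage\n' | tr -d 'abc')

printf '%s\n' "$translated" "$deleted"
```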
In the third synopsis form, the characters in
are compressed as described for the
In the fourth synopsis form, the characters in
are deleted from the input, and the characters in
are compressed as described for the
The following conventions can be used in
to specify sets of characters:
.Bl -tag -width [:equiv:]
Any character not described by one of the following conventions
A backslash followed by one, two, or three octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
A backslash followed by certain special characters maps to special
.It \ea \*[Lt]alert character\*[Gt]
.It \eb \*[Lt]backspace\*[Gt]
.It \ef \*[Lt]form-feed\*[Gt]
.It \en \*[Lt]newline\*[Gt]
.It \er \*[Lt]carriage return\*[Gt]
.It \et \*[Lt]tab\*[Gt]
.It \ev \*[Lt]vertical tab\*[Gt]
A backslash followed by any other character maps to that character.
Represents the range of characters between the range endpoints, inclusive.
Represents all characters belonging to the defined character class.
.It alnum \*[Lt]alphanumeric characters\*[Gt]
.It alpha \*[Lt]alphabetic characters\*[Gt]
.It blank \*[Lt]blank characters\*[Gt]
.It cntrl \*[Lt]control characters\*[Gt]
.It digit \*[Lt]numeric characters\*[Gt]
.It graph \*[Lt]graphic characters\*[Gt]
.It lower \*[Lt]lower-case alphabetic characters\*[Gt]
.It print \*[Lt]printable characters\*[Gt]
.It punct \*[Lt]punctuation characters\*[Gt]
.It space \*[Lt]space characters\*[Gt]
.It upper \*[Lt]upper-case characters\*[Gt]
.It xdigit \*[Lt]hexadecimal characters\*[Gt]
.\" All classes may be used in
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
With the exception of the
classes, characters in the classes are in unspecified order.
classes, characters are entered in ascending order.
For details of which ASCII characters are included
in these classes, see
and related manual pages.
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
If there is a secondary ordering within the equivalence class, the
characters are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be
English has no equivalence classes.
repeated occurrences of the character represented by
expression is only valid when it occurs in
is omitted or is zero, it is interpreted as large enough to extend the
sequence to the length of
has a leading zero, it is interpreted as an octal value;
otherwise, it is interpreted as a decimal value.
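The conventions above can be sketched with a few short commands.
The inputs and variable names are illustrative assumptions, and the
octal value 040 for the space character assumes an ASCII-compatible
encoding.

```shell
# Octal escape: \040 is the space character in ASCII.
octal=$(printf 'a b\n' | tr '\040' '_')

# Special-character escape: \t is a tab.
escape=$(printf 'a\tb\n' | tr '\t' ' ')

# Range: a through m map to A through M; 'o' lies outside the range.
range=$(printf 'hello\n' | tr 'a-m' 'A-M')

# Character class: delete every digit.
class=$(printf 'a1b2\n' | tr -d '[:digit:]')

# Repetition in the second string: two x's followed by two y's.
repeat=$(printf 'abcd\n' | tr 'abcd' '[x*2][y*2]')

printf '%s\n' "$octal" "$escape" "$range" "$class" "$repeat"
```

Each set is single-quoted so that the shell does not interpret the
backslashes or brackets before the utility sees them.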
The following examples are shown as given to the shell:
Create a list of the words in
one per line, where a word is taken to be a maximal string of letters:
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q \*[Lt] file1"
Translate the contents of
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q \*[Lt] file1"
Strip out non-printable characters from
.D1 Li "tr -cd \*q[:print:]\*q \*[Lt] file1"
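The examples above can be tried on a small sample, sketched here under
the assumption that
mktemp
is available; the file contents and variable names are hypothetical.

```shell
# Create a throwaway input file standing in for file1.
file1=$(mktemp)
printf 'Hello, world!\n' > "$file1"

# One word per line: runs of non-letters collapse to a single newline.
words=$(tr -cs '[:alpha:]' '\n' < "$file1")

# Upper-case the whole file.
upper=$(tr '[:lower:]' '[:upper:]' < "$file1")

# Strip non-printable characters; note the newline is removed too.
printable=$(printf 'a\001b\n' | tr -cd '[:print:]')

rm -f "$file1"
printf '%s\n' "$words" "$upper" "$printable"
```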
has historically implemented character ranges using the syntax
implementations and standardized by POSIX.
shell scripts should work under this implementation as long as
the range is intended to map in another range, i.e., the command
will work, as it will map the
However, if the shell script is deleting or squeezing characters as in
will be included in the deletion or compression list which would
not have happened under a historic
Additionally, any scripts that depended on the sequence
to represent the three characters
will have to be rewritten as
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
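A minimal sketch of NUL handling, assuming a
printf
that emits a NUL byte for the escape \e000; the sample data and
variable names are illustrative only.

```shell
# Delete NUL bytes explicitly.
nul_free=$(printf 'a\000b\n' | tr -d '\000')

# Turn NUL-terminated records into newline-terminated lines.
lines=$(printf 'one\000two\000' | tr '\000' '\n')

printf '%s\n' "$nul_free" "$lines"
```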
utility has historically been extremely forgiving of syntax errors,
options were ignored unless two strings were specified.
This implementation will not permit invalid syntax.
utility is expected to be
Note that the feature wherein the last character of
has fewer characters than
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
convention instead of relying on this behavior.
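A sketch of the portable form follows.
It assumes the goal is to map several characters onto one; the input
and the variable name are hypothetical.

```shell
# Explicit repetition: x is repeated to cover all of the first string,
# so the result does not depend on implicit padding.
portable=$(printf 'abc\n' | tr 'abc' '[x*]')

printf '%s\n' "$portable"
```

Writing the second string as a bare x would rely on the padding
behavior described above.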
was originally designed to work with
Its use with character sets that do not share all the properties of
e.g., a symmetric set of upper and lower case characters
that can be algorithmically converted one to the other,
may yield unpredictable results.
should be internationalized.