1 .\" $NetBSD: awk.1,v 1.18 2009/03/09 14:10:12 joerg Exp $
3 .\" Copyright (C) Lucent Technologies 1997
4 .\" All Rights Reserved
6 .\" Permission to use, copy, modify, and distribute this software and
7 .\" its documentation for any purpose and without fee is hereby
8 .\" granted, provided that the above copyright notice appear in all
9 .\" copies and that both that the copyright notice and this
10 .\" permission notice and warranty disclaimer appear in supporting
11 .\" documentation, and that the name Lucent Technologies or any of
12 .\" its entities not be used in advertising or publicity pertaining
13 .\" to distribution of the software without specific, written prior
16 .\" LUCENT DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE,
17 .\" INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS.
18 .\" IN NO EVENT SHALL LUCENT OR ANY OF ITS ENTITIES BE LIABLE FOR ANY
19 .\" SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
20 .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
21 .\" IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
22 .\" ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
30 .Nd pattern-directed scanning and processing language
37 .Op Ar prog | Fl f Ar filename
43 is the Bell Labs implementation of the AWK programming language as
45 .Em The AWK Programming Language
47 A. V. Aho, B. W. Kernighan, and P. J. Weinberger.
52 for lines that match any of a set of patterns specified literally in
54 or in one or more files
58 there can be an associated action that will be performed
62 Each line is matched against the
63 pattern portion of every pattern-action statement;
64 the associated action is performed for each matched pattern.
67 means the standard input.
72 is treated as an assignment, not a filename,
73 and is executed at the time it would have been opened if it were a filename.
75 The options are as follows:
76 .Bl -tag -width indent
78 Set the debug level to the specified number
80 If the number is omitted, debug level is set to 1.
82 Read the AWK program source from the specified file
84 instead of the first command line argument.
87 options may be specified.
89 Set the input field separator
91 to the regular expression
93 .It Fl mr Ar NNN , Fl mf Ar NNN
94 Obsolete, no longer needed options.
95 Set limit on maximum record or
98 Potentially unsafe functions such as
100 make the program abort (with a warning message).
101 .It Fl v Ar var Ns = Ns Ar value
111 options may be present.
115 version on standard output and exit.
118 An input line is normally made up of fields separated by white space,
119 or by the regular expression
121 The fields are denoted
126 refers to the entire line.
129 is null, the input line is split into one field per character.
131 A pattern-action statement has the form
133 .Dl pattern \&{ action \&}
135 A missing \&{ action \&}
136 means print the line;
137 a missing pattern always matches.
138 Pattern-action statements are separated by newlines or semicolons.
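.Pp
As a rough sketch of these two defaults (the sample input and the shell
variable names are illustrative assumptions, not part of this manual),
a pattern without an action prints each matching line, and an action
without a pattern runs on every line:

```shell
# Made-up sample input: a name and a count on each line.
input='apple 3
banana 7
cherry 5'

# Pattern only: the missing action defaults to printing the line.
pattern_only=$(echo "$input" | awk '$2 > 4')

# Action only: the missing pattern matches every line.
action_only=$(echo "$input" | awk '{ print $1 }')

echo "$pattern_only"
echo "$action_only"
```

The first command should print only the lines whose second field exceeds 4;
the second should print the first field of every line.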
140 An action is a sequence of statements.
141 Statements are terminated by
142 semicolons, newlines or right braces.
147 String constants are quoted
149 with the usual C escapes recognized within.
150 Expressions take on string or numeric values as appropriate,
151 and are built using the
153 (see next subsection).
154 Variables may be scalars, array elements
158 Variables are initialized to the null string.
159 Array subscripts may be any string,
160 not necessarily numeric;
161 this allows for a form of associative memory.
162 Multiple subscripts such as
164 are permitted; the constituents are concatenated,
165 separated by the value of
169 operators, in order of decreasing precedence, are:
171 .Bl -tag -width indent -compact
177 Increment and decrement; either can be used as postfix or prefix.
181 form is also supported, and
183 for the assignment operator).
185 Unary plus, unary minus and logical negation.
187 Multiplication, division and modulus.
189 Addition and subtraction.
191 String concatenation.
195 Regular relational operators
197 Regular expression match and not match
200 .It Ic "\*[Am]\*[Am]"
205 C conditional expression.
207 .Ar expr1 Ic \&? Ar expr2 Ic \&: Ar expr3 No .
210 is true, the result value is
221 Assignment and Operator-Assignment
223 .Ss Control Statements
224 The control statements are as follows:
226 .Bl -hang -offset indent -width indent -compact
227 .It Ic if \&( Ar expression Ic \&) Ar statement Bq Ic else Ar statement
228 .It Ic while \&( Ar expression Ic \&) Ar statement
229 .It Ic for \&( Ar expression Ic \&; Ar expression Ic \&; \
230 Ar expression Ic \&) Ar statement
231 .It Ic for \&( Va var Ic in Ar array Ic \&) Ar statement
232 .It Ic do Ar statement Ic while \&( Ar expression Ic \&)
235 .It Ic delete Va array Bq Ar expression
236 .It Ic delete Va array
237 .It Ic exit Bq Ar expression
239 .It Ic return Bq Ar expression
240 .It Ic \&{ Ar [ statement ... ] Ic \&}
243 The input/output statements are as follows:
245 .Bl -tag -width indent
247 Closes the file or pipe
249 Returns zero on success; otherwise nonzero.
251 Flushes any buffered output for the file or pipe
253 Returns zero on success; otherwise nonzero.
254 .It Ic getline Bq Va var
261 to the next input record from the current input file.
263 returns 1 for a successful input,
264 0 for end of file, and \-1 for an error.
265 .It Ic getline Bo Va var Bc Ic \*[Lt] Ar file
272 to the next input record from the specified file
274 .It Ar expr Ic \&| getline
281 returns the next line of output from
284 Skip remaining patterns on this input line.
286 Skip rest of this file, open next, start at top.
287 .It Ic print Bo Ar expr-list Bc Bq Ic \*[Gt] Ar file
290 statement prints its arguments on the standard output (or to a file
296 separated by the current output field separator
298 and terminated by the
299 output record separator
305 may be literal names or parenthesized expressions; identical string values in
306 different statements denote the same open file.
307 .It Ic printf Ar format Bo Ic \&, Ar expr-list Bc Bq Ic \*[Gt] Ar file
308 Format and print its expression list according to
312 for a list of supported formats and their meanings.
314 .Ss Mathematical and Numeric Functions
315 AWK has the following mathematical and numeric functions built in:
317 .Bl -tag -width indent
319 Returns the arctangent of
325 Computes the cosine of
331 Computes the exponential value of the given argument
340 Computes the value of the natural logarithm of argument
345 Returns a random number between 0 and 1.
353 Computes the non-negative square root of
358 Sets the seed for the random number generator (
360 and returns the previous seed.
363 AWK has the following string functions built in:
365 .Bl -tag -width indent
366 .It Fn gensub r s h [t]
367 Search the target string
369 for matches of the regular expression
373 is a string beginning with
377 then replace all matches of
383 is a number indicating which match of
391 .\"Within the replacement text
397 .\"is a digit from 1 to 9, may be used to indicate just the text that
400 .\"parenthesized subexpression.
403 .\"represents the entire text, as does the character
409 the modified string is returned as the result of the function,
410 and the original target is
415 sequences within replacement string
421 supported at this moment.
422 .It Fn gsub r t "[s]"
425 except that all occurrences of the regular expression
430 return the number of replacements.
436 occurs, or 0 if it does not.
437 .It Fn length "[string]"
438 Returns the length of its argument
446 where the regular expression
448 occurs, or 0 if it does not.
453 are set to the position and length of the matched string.
454 .It Fn split s a "[fs]"
464 The separation is done with the regular expression
466 or with the field separator
471 An empty string as field separator splits the string
472 into one array element per character.
473 .It Fn sprintf fmt expr "..."
474 Returns the string resulting from formatting
483 for the first occurrence of the regular expression
492 .It Fn substr s m [n]
494 .Ar n Ns No -character
502 is omitted, the rest of
508 with all upper-case characters translated to their
509 corresponding lower-case equivalents.
513 with all lower-case characters translated to their
514 corresponding upper-case equivalents.
519 provides the following two functions for obtaining
520 timestamps and formatting them:
521 .Bl -tag -width indent
523 Returns the value of time in seconds since the start of
525 Epoch (Midnight, January 1, 1970, Coordinated Universal Time).
528 .It Fn strftime "[format [, timestamp]]"
531 according to the string
534 should be in the same form as the value returned by
538 is missing, the current time is used.
541 is missing, a default format equivalent to the output of
544 See the specification of ANSI C
546 for the format conversions which are supported.
548 .Ss Other Built-in Functions
549 .Bl -tag -width indent
553 and returns its exit status
556 Patterns are arbitrary Boolean combinations
558 .Ic "! || \*[Am]\*[Am]" )
559 of regular expressions and
560 relational expressions.
561 Regular expressions are as in
563 Isolated regular expressions
564 in a pattern apply to the entire line.
565 Regular expressions may also occur in
566 relational expressions, using the operators
571 is a constant regular expression;
572 any string (constant or variable) may be used
573 as a regular expression, except in the position of an isolated regular expression
576 A pattern may consist of two patterns separated by a comma;
577 in this case, the action is performed for all lines
578 from an occurrence of the first pattern
579 through an occurrence of the second.
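.Pp
A small sketch of such a range pattern, run on made-up input (the
markers and the shell variable names are assumptions chosen for
illustration):

```shell
# Made-up input with one start/stop pair surrounded by other lines.
input='one
start
two
stop
three'

# The range is inclusive: the lines matching /start/ and /stop/
# are printed along with everything between them.
range_out=$(echo "$input" | awk '/start/, /stop/')

echo "$range_out"
```

Lines outside the pair, here the first and the last, are not selected.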
581 A relational expression is one of the following:
582 .Bl -tag -offset indent -width indent -compact
583 .It Ar expression matchop regular-expression
584 .It Ar expression relop expression
585 .It Ar expression Ic in Ar array-name
586 .It ( Ar expr , expr,\&... Ic ") in" Ar array-name
591 is any of the six relational operators in C,
600 A conditional is an arithmetic expression,
601 a relational expression,
602 or a Boolean combination
609 may be used to capture control before the first input line is read
614 do not combine with other patterns.
615 .Ss Built-in Variables
616 Variable names with special meanings:
617 .Bl -hang -width FILENAMES
619 argument count, assignable
621 argument array, assignable;
622 non-null members are taken as filenames
624 conversion format used when converting numbers
628 array of environment variables; subscripts are names.
630 the name of the current input file
632 ordinal number of the current record in the current file
634 regular expression used to separate fields; also settable
638 number of fields in the current record
640 ordinal number of the current record
642 output format for numbers (default
646 output field separator (default blank)
648 output record separator (default newline)
650 input record separator (default newline)
652 Position of the first character matched by
656 Length of the string matched by
660 separates multiple subscripts (default 034)
663 Functions may be defined (at the position of a pattern-action statement) thus:
664 .Bd -filled -offset indent
665 .Ic function foo(a, b, c) { ...; return x }
668 Parameters are passed by value if scalar and by reference if array name;
669 functions may be called recursively.
670 Parameters are local to the function; all other variables are global.
671 Thus local variables may be created by providing excess parameters in
672 the function definition.
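.Pp
The following sketch illustrates these rules under stated assumptions:
the function name, the array contents and the shell variable names are
hypothetical.
The extra parameters act as local variables, while the array argument
is modified in place:

```shell
prog='
# Hypothetical helper: "i" and "total" are excess parameters used as locals.
function sum(arr, n,    i, total) {
    for (i = 1; i <= n; i++)
        total += arr[i]
    arr[0] = "seen"      # arrays are passed by reference, so the caller sees this
    return total
}
BEGIN {
    i = 99               # the global i is untouched by the local i inside sum()
    a[1] = 2; a[2] = 3; a[3] = 5
    print sum(a, 3), i, a[0]
}'

func_out=$(awk "$prog")
echo "$func_out"
```

By common convention, the extra spaces in the parameter list separate the
real arguments from the parameters that serve only as locals.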
674 .Bl -tag -width indent -compact
675 .It Ic length($0) \*[Gt] 72
676 Print lines longer than 72 characters.
678 .It Ic \&{ print $2, $1 \&}
679 Print first two fields in opposite order.
681 .It Ic BEGIN { FS = \&",[ \et]*|[ \et]+\&" }
682 .It Ic "\ \ \ \ \ \ {" print \&$2, \&$1 }
683 Same, with input fields separated by comma and/or blanks and tabs.
685 .It Ic "\ \ \ \ {" s += $1 }
686 .It Ic END { print \&"sum is\&", s, \&" average is\ \&",\ s/NR\ }
687 Add up first column, print sum and average.
689 .It Ic /start/, /stop/
690 Print all lines between start/stop pairs.
692 .It Ic BEGIN { # Simulate echo(1)
693 .It Ic "\ \ \ \ " for (i = 1; i \*[Lt] ARGC;\ i++)\ printf\ \&"%s\ \&",\ ARGV[i]
694 .It Ic "\ \ \ \ " printf \&"\en\&"
695 .It Ic "\ \ \ \ " exit }
710 A. V. Aho, B. W. Kernighan, P. J. Weinberger,
711 .Em The AWK Programming Language ,
712 Addison-Wesley, 1988.
715 .Em AWK Language Programming ,
716 Edition 1.0, published by the Free Software Foundation, 1995.
719 has been the default system
723 replacing the previously used GNU
726 There are no explicit conversions between numbers and strings.
727 To force an expression to be treated as a number, add 0 to it;
728 to force it to be treated as a string, concatenate
731 The scope rules for variables in functions are a botch;