.\" Copyright 1989 AT&T Copyright (c) 1996, Sun Microsystems, Inc. All Rights Reserved
.\" The contents of this file are subject to the terms of the Common Development and Distribution License (the "License"). You may not use this file except in compliance with the License.
.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE or http://www.opensolaris.org/os/licensing. See the License for the specific language governing permissions and limitations under the License.
.\" When distributing Covered Code, include this CDDL HEADER in each file and include the License file at usr/src/OPENSOLARIS.LICENSE. If applicable, add the following below this CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your own identifying information: Portions Copyright [yyyy] [name of copyright owner]
.TH REGEXPR 3GEN "Dec 29, 1996"
.SH NAME
regexpr, compile, step, advance \- regular expression compile and match
routines
.SH SYNOPSIS
.LP
.nf
\fBcc\fR [\fIflag\fR]... [\fIfile\fR]... \fB-lgen\fR [\fIlibrary\fR]...

#include <regexpr.h>

\fBchar *\fR\fBcompile\fR(\fBchar *\fR\fIinstring\fR, \fBchar *\fR\fIexpbuf\fR, \fBconst char *\fR\fIendbuf\fR);

\fBint\fR \fBstep\fR(\fBconst char *\fR\fIstring\fR, \fBconst char *\fR\fIexpbuf\fR);

\fBint\fR \fBadvance\fR(\fBconst char *\fR\fIstring\fR, \fBconst char *\fR\fIexpbuf\fR);

\fBextern char *\fRloc1\fB, *\fRloc2\fB, *\fRlocs\fB;\fR

\fBextern int \fRnbra\fB, \fRregerrno\fB, \fRreglength\fB;\fR

\fBextern char *\fRbraslist\fB[], *\fRbraelist\fB[];\fR
.fi
.SH DESCRIPTION
.LP
These routines compile regular expressions and match the compiled
expressions against lines. The regular expressions compiled are in the form
used by \fBed\fR(1).
.LP
The parameter \fIinstring\fR is a null-terminated string representing the
regular expression.
.LP
The parameter \fIexpbuf\fR points to the place where the compiled regular
expression is to be placed. If \fIexpbuf\fR is \fINULL\fR, \fBcompile()\fR
uses \fBmalloc\fR(3C) to allocate the space for the compiled regular
expression. If an error occurs, this space is freed. It is the user's
responsibility to free the space once the compiled regular expression is
no longer needed.
.LP
The parameter \fIendbuf\fR is one more than the highest address where the
compiled regular expression may be placed. This argument is ignored if
\fIexpbuf\fR is \fINULL\fR. If the compiled expression cannot fit in
(\fIendbuf\fR\(mi\fIexpbuf\fR) bytes, \fBcompile()\fR returns \fINULL\fR and
\fBregerrno\fR (see below) is set to 50.
.LP
The parameter \fIstring\fR is a pointer to a string of characters to be
checked for a match. This string should be null-terminated.
.LP
The parameter \fIexpbuf\fR is the compiled regular expression obtained by a
call of the function \fBcompile()\fR.
.LP
The function \fBstep()\fR returns non-zero if the given string matches the
regular expression, and zero if it does not. If there is a
match, two external character pointers, \fIloc1\fR and \fIloc2\fR, are set
as a side effect of the call to \fBstep()\fR.
\fIloc1\fR points to the first character that matched the regular
expression. \fIloc2\fR points to the character after the last
character that matched the regular expression. Thus if the regular expression
matches the entire line, \fIloc1\fR points to the first character of
\fIstring\fR and \fIloc2\fR points to the null at the end of \fIstring\fR.
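
The relationship between \fIloc1\fR and \fIloc2\fR can be sketched with the POSIX regcomp() and regexec() functions from <regex.h>. This is a different interface from the one documented here, used only because it is widely available. The helper name find_match() is hypothetical; its \fIstart\fR and \fIend\fR results play the roles of \fIloc1\fR and \fIloc2\fR.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Analogue of step() built on POSIX regex (an assumption of this sketch,
 * not the libgen interface). Searches text for pattern, written in basic
 * regular expression syntax. On a match, stores the equivalents of loc1
 * and loc2 and returns 1. Returns 0 if there is no match or the pattern
 * does not compile.
 */
int find_match(const char *pattern, const char *text,
               const char **start, const char **end)
{
    regex_t re;
    regmatch_t m;
    int found = 0;

    if (regcomp(&re, pattern, 0) != 0)
        return 0;
    if (regexec(&re, text, 1, &m, 0) == 0) {
        *start = text + m.rm_so;  /* first matched character, like loc1 */
        *end = text + m.rm_eo;    /* one past the last one, like loc2 */
        found = 1;
    }
    regfree(&re);
    return found;
}
```

For the line hello world and the pattern wor, \fIstart\fR points at the w and \fIend\fR points at the l that follows wor, so \fIend\fR minus \fIstart\fR gives the match length, 3.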
.LP
The purpose of \fBstep()\fR is to step through the \fIstring\fR argument until
a match is found or until the end of \fIstring\fR is reached. If the regular
expression begins with \fB^\fR, \fBstep()\fR tries to match the regular
expression at the beginning of the string only.
.LP
The \fBadvance()\fR function is similar to \fBstep()\fR, but it sets only the
variable \fIloc2\fR and always restricts matches to the beginning of the
string.
.LP
If one is looking for successive matches in the same string of characters,
\fIlocs\fR should be set equal to \fIloc2\fR, and \fBstep()\fR should be called
with \fIstring\fR equal to \fIloc2\fR. \fIlocs\fR is used by commands like
\fBed\fR and \fBsed\fR so that global substitutions like \fBs/y*//g\fR do not
loop forever. It is \fINULL\fR by default.
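
A loop over successive matches can be sketched with POSIX regexec(), which is a different interface from \fBstep()\fR. The helper name count_matches() is hypothetical. Moving the scan pointer past each match corresponds to calling \fBstep()\fR with \fIstring\fR equal to \fIloc2\fR. Stepping over an empty match by hand is only a simplified stand-in for the protection that \fIlocs\fR provides; it is not a description of how \fBed\fR or \fBsed\fR are implemented.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Counts the non-empty, non-overlapping matches of pattern (basic
 * regular expression syntax) in text. Returns -1 if the pattern does
 * not compile. POSIX regex is an assumption of this sketch.
 */
int count_matches(const char *pattern, const char *text)
{
    regex_t re;
    regmatch_t m;
    const char *p = text;
    int flags = 0;
    int count = 0;

    if (regcomp(&re, pattern, 0) != 0)
        return -1;
    while (regexec(&re, p, 1, &m, flags) == 0) {
        if (m.rm_eo > m.rm_so) {
            count++;
            p += m.rm_eo;          /* resume after the match */
        } else {
            /* Empty match: without this step the loop would never end. */
            if (p[m.rm_eo] == 0)
                break;             /* reached the end of the text */
            p += m.rm_eo + 1;      /* skip one character and retry */
        }
        flags = REG_NOTBOL;        /* p no longer starts the line */
    }
    regfree(&re);
    return count;
}
```

With the pattern y* a match of length zero is possible at every position, which is the situation the \fBs/y*//g\fR example refers to. The explicit skip is what lets the scan reach the end of the text.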
.LP
The external variable \fBnbra\fR is used to determine the number of
subexpressions in the compiled regular expression. \fBbraslist\fR and
\fBbraelist\fR are arrays of character pointers that point to the start and end
of the \fBnbra\fR subexpressions in the matched string. For example, after
calling \fBstep()\fR or \fBadvance()\fR with string \fBsabcdefg\fR and regular
expression \fB\e(abcdef\e)\fR, \fBbraslist[0]\fR will point at \fBa\fR and
\fBbraelist[0]\fR will point at \fBg\fR. These arrays are used by commands like
\fBed\fR and \fBsed\fR for substitute replacement patterns that contain the
\fB\e\fR\fIn\fR notation for subexpressions.
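
The start and end pointers for a subexpression can be illustrated with POSIX regexec(), again as an analogue and not the libgen interface. The helper name group_span() and the limit MAX_GROUPS are hypothetical. In this sketch the field re_nsub plays the role of \fBnbra\fR, and the returned \fIstart\fR and \fIend\fR correspond to entries of \fBbraslist\fR and \fBbraelist\fR. POSIX numbers subexpressions from 1 and reserves 0 for the whole match, whereas \fBbraslist\fR is indexed from 0.

```c
#include <regex.h>
#include <stddef.h>

#define MAX_GROUPS 10   /* arbitrary limit chosen for this sketch */

/*
 * Finds subexpression n (1-based, POSIX numbering) of pattern in text.
 * On success stores pointers to its first character and to the
 * character after its last one, then returns 1. Returns 0 otherwise.
 */
int group_span(const char *pattern, const char *text, size_t n,
               const char **start, const char **end)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int ok = 0;

    if (n >= MAX_GROUPS || regcomp(&re, pattern, 0) != 0)
        return 0;
    /* re.re_nsub is the number of parenthesized subexpressions. */
    if (n <= re.re_nsub &&
        regexec(&re, text, MAX_GROUPS, m, 0) == 0 &&
        m[n].rm_so != -1) {
        *start = text + m[n].rm_so;
        *end = text + m[n].rm_eo;
        ok = 1;
    }
    regfree(&re);
    return ok;
}
```

Called with the string and regular expression from the example above and \fIn\fR equal to 1, \fIstart\fR points at \fBa\fR and \fIend\fR points at \fBg\fR, matching the values described for \fBbraslist[0]\fR and \fBbraelist[0]\fR.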
.LP
Note that it is not necessary to use the external variables \fBregerrno\fR,
\fBnbra\fR, \fBloc1\fR, \fBloc2\fR, \fBlocs\fR, \fBbraelist\fR, and
\fBbraslist\fR if one is only checking whether or not a string matches a
regular expression.
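.SH EXAMPLES
.LP
\fBExample 1 \fRThe following is similar to the regular expression code from
\fBgrep\fR. Here \fBregerr()\fR and \fBsucceed()\fR stand for
application-defined routines, and the pointer returned by \fBcompile()\fR is
saved in \fIexpbuf\fR for the later call to \fBstep()\fR:
.sp
.in +2
.nf
if ((expbuf = compile(*argv, (char *)0, (char *)0)) == (char *)0)
	regerr(regerrno);
if (step(linebuf, expbuf))
	succeed();
.fi
.in -2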
.SH RETURN VALUES
.LP
If \fBcompile()\fR succeeds, it returns a non-\fINULL\fR pointer whose value
depends on \fIexpbuf\fR. If \fIexpbuf\fR is non-\fINULL\fR, \fBcompile()\fR
returns a pointer to the byte after the last byte in the compiled regular
expression. The length of the compiled regular expression is stored in
\fBreglength\fR. If \fIexpbuf\fR is \fINULL\fR, \fBcompile()\fR returns a
pointer to the space allocated by \fBmalloc\fR(3C).
.LP
The functions \fBstep()\fR and \fBadvance()\fR return non-zero if the given
string matches the regular expression, and zero if it does not
match.
.SH ERRORS
.LP
If an error is detected when compiling the regular expression, a \fINULL\fR
pointer is returned from \fBcompile()\fR and \fBregerrno\fR is set to one of
the non-zero error numbers indicated below:
.sp
.in +2
.nf
11	Range endpoint too large.
25	"\edigit" out of range.
36	Illegal or missing delimiter.
41	No remembered string search.
42	\e(~\e) imbalance.
44	More than 2 numbers given in \e{~\e}.
45	} expected after \e.
46	First number exceeds second in \e{~\e}.
50	Regular expression overflow.
.fi
.in -2
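
A program might translate \fBregerrno\fR values into messages with a small lookup table. The sketch below covers only the error numbers listed above. The names regexpr_error, regexpr_errors, and regexpr_errmsg() are hypothetical and are not part of libgen.

```c
#include <stddef.h>

/* One row of the table: an error number and its message. */
struct regexpr_error {
    int code;
    const char *message;
};

/* Messages copied from the list above; other values are not covered. */
static const struct regexpr_error regexpr_errors[] = {
    { 11, "Range endpoint too large." },
    { 25, "\"\\digit\" out of range." },
    { 36, "Illegal or missing delimiter." },
    { 41, "No remembered string search." },
    { 42, "\\(~\\) imbalance." },
    { 44, "More than 2 numbers given in \\{~\\}." },
    { 45, "} expected after \\." },
    { 46, "First number exceeds second in \\{~\\}." },
    { 50, "Regular expression overflow." }
};

/* Returns the message for code, or NULL if code is not in the table. */
const char *regexpr_errmsg(int code)
{
    size_t n = sizeof(regexpr_errors) / sizeof(regexpr_errors[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (regexpr_errors[i].code == code)
            return regexpr_errors[i].message;
    }
    return NULL;
}
```

A caller would typically pass \fBregerrno\fR to such a helper after \fBcompile()\fR returns \fINULL\fR, and fall back to printing the bare number when the helper returns NULL.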
.SH ATTRIBUTES
.LP
See \fBattributes\fR(5) for descriptions of the following attributes:
.sp
ATTRIBUTE TYPE	ATTRIBUTE VALUE
.SH SEE ALSO
.LP
\fBed\fR(1), \fBgrep\fR(1), \fBsed\fR(1), \fBmalloc\fR(3C),
\fBattributes\fR(5), \fBregexp\fR(5)
.SH NOTES
.LP
When compiling multi-threaded applications, the \fB_REENTRANT\fR flag must be
defined on the compile line. This flag should only be used in multi-threaded
applications.