1 *java.util.regex.Matcher* *Matcher* An engine that performs match operations on
3 public final class Matcher
4 extends |java.lang.Object|
5 implements |java.util.regex.MatchResult|
7 |java.util.regex.Matcher_Description|
8 |java.util.regex.Matcher_Fields|
9 |java.util.regex.Matcher_Constructors|
10 |java.util.regex.Matcher_Methods|
12 ================================================================================
14 *java.util.regex.Matcher_Methods*
15 |java.util.regex.Matcher.appendReplacement(StringBuffer,String)|Implements a no
16 |java.util.regex.Matcher.appendTail(StringBuffer)|Implements a terminal append-
17 |java.util.regex.Matcher.end()|Returns the offset after the last character matc
18 |java.util.regex.Matcher.end(int)|Returns the offset after the last character o
19 |java.util.regex.Matcher.find()|Attempts to find the next subsequence of the in
20 |java.util.regex.Matcher.find(int)|Resets this matcher and then attempts to fin
21 |java.util.regex.Matcher.group()|Returns the input subsequence matched by the p
22 |java.util.regex.Matcher.group(int)|Returns the input subsequence captured by t
23 |java.util.regex.Matcher.groupCount()|Returns the number of capturing groups in
24 |java.util.regex.Matcher.hasAnchoringBounds()|Queries the anchoring of region b
25 |java.util.regex.Matcher.hasTransparentBounds()|Queries the transparency of reg
26 |java.util.regex.Matcher.hitEnd()|Returns true if the end of input was hit by t
27 |java.util.regex.Matcher.lookingAt()|Attempts to match the input sequence, star
28 |java.util.regex.Matcher.matches()|Attempts to match the entire region against
29 |java.util.regex.Matcher.pattern()|Returns the pattern that is interpreted by t
30 |java.util.regex.Matcher.quoteReplacement(String)|Returns a literal replacement
31 |java.util.regex.Matcher.region(int,int)|Sets the limits of this matcher's regi
32 |java.util.regex.Matcher.regionEnd()|Reports the end index (exclusive) of this
33 |java.util.regex.Matcher.regionStart()|Reports the start index of this matcher'
34 |java.util.regex.Matcher.replaceAll(String)|Replaces every subsequence of the i
35 |java.util.regex.Matcher.replaceFirst(String)|Replaces the first subsequence of
36 |java.util.regex.Matcher.requireEnd()|Returns true if more input could change a
37 |java.util.regex.Matcher.reset()|Resets this matcher.
38 |java.util.regex.Matcher.reset(CharSequence)|Resets this matcher with a new inp
39 |java.util.regex.Matcher.start()|Returns the start index of the previous match.
40 |java.util.regex.Matcher.start(int)|Returns the start index of the subsequence
41 |java.util.regex.Matcher.toMatchResult()|Returns the match state of this matche
42 |java.util.regex.Matcher.toString()|Returns the string representation of this m
43 |java.util.regex.Matcher.useAnchoringBounds(boolean)|Sets the anchoring of regi
44 |java.util.regex.Matcher.usePattern(Pattern)|Changes the Pattern that this Matc
45 |java.util.regex.Matcher.useTransparentBounds(boolean)|Sets the transparency of
47 *java.util.regex.Matcher_Description*
An engine that performs match operations on a character sequence
(|java.lang.CharSequence|) by interpreting a pattern
(|java.util.regex.Pattern|).
53 A matcher is created from a pattern by invoking the pattern's
54 matcher(|java.util.regex.Pattern|) method. Once created, a matcher can be used
55 to perform three different kinds of match operations:
59 The matches(|java.util.regex.Matcher|) method attempts to match the entire
60 input sequence against the pattern.
62 The lookingAt(|java.util.regex.Matcher|) method attempts to match the input
63 sequence, starting at the beginning, against the pattern.
65 The find(|java.util.regex.Matcher|) method scans the input sequence looking for
66 the next subsequence that matches the pattern.
Each of these methods returns a boolean indicating success or failure. More
information about a successful match can be obtained by querying the state of
the matcher.
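The difference between the three operations can be sketched as follows. This is a minimal illustration; the class and method names (MatchKindsDemo, threeKinds) are hypothetical and not part of the API.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class MatchKindsDemo {
    // Returns {matches(), lookingAt(), find()} for one regex/input pair.
    // A fresh matcher is created for each call so the results are independent.
    static boolean[] threeKinds(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        return new boolean[] {
            p.matcher(input).matches(),    // whole input must match
            p.matcher(input).lookingAt(),  // a prefix must match
            p.matcher(input).find()        // any subsequence may match
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(threeKinds("cat", "cat")));     // [true, true, true]
        System.out.println(Arrays.toString(threeKinds("cat", "catalog"))); // [false, true, true]
        System.out.println(Arrays.toString(threeKinds("cat", "bobcat")));  // [false, false, true]
    }
}
```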
74 A matcher finds matches in a subset of its input called the region. By default,
75 the region contains all of the matcher's input. The region can be modified via
76 the region(|java.util.regex.Matcher|) method and queried via the
77 regionStart(|java.util.regex.Matcher|) and regionEnd(|java.util.regex.Matcher|)
78 methods. The way that the region boundaries interact with some pattern
79 constructs can be changed. See useAnchoringBounds(|java.util.regex.Matcher|)
80 and useTransparentBounds(|java.util.regex.Matcher|) for more details.
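As a rough sketch of how a region narrows the search, the example below restricts a matcher to part of its input. The helper name firstWordInRegion and the sample input are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionDemo {
    // Finds the first run of word characters inside [start, end) of the input,
    // or returns null when the region contains none.
    static String firstWordInRegion(String input, int start, int end) {
        Matcher m = Pattern.compile("\\w+").matcher(input);
        m.region(start, end); // resets the matcher and narrows the search window
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "alpha beta gamma";
        System.out.println(firstWordInRegion(input, 0, input.length())); // alpha
        System.out.println(firstWordInRegion(input, 6, 10));             // beta
        System.out.println(firstWordInRegion(input, 7, 10));             // eta
    }
}
```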
82 This class also defines methods for replacing matched subsequences with new
83 strings whose contents can, if desired, be computed from the match result. The
84 appendReplacement(|java.util.regex.Matcher|) and
85 appendTail(|java.util.regex.Matcher|) methods can be used in tandem in order to
86 collect the result into an existing string buffer, or the more convenient
87 replaceAll(|java.util.regex.Matcher|) method can be used to create a string in
88 which every matching subsequence in the input sequence is replaced.
90 The explicit state of a matcher includes the start and end indices of the most
91 recent successful match. It also includes the start and end indices of the
92 input subsequence captured by each capturing group in the pattern as well as a
93 total count of such subsequences. As a convenience, methods are also provided
94 for returning these captured subsequences in string form.
96 The explicit state of a matcher is initially undefined; attempting to query any
97 part of it before a successful match will cause an
98 (|java.lang.IllegalStateException|) to be thrown. The explicit state of a
99 matcher is recomputed by every match operation.
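The lifecycle of the explicit state can be illustrated with a small probe. The helper names (queryFails, queryFailures) are hypothetical; the sketch only records whether start() throws at each stage.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExplicitStateDemo {
    // Returns true when querying start() is rejected with IllegalStateException.
    static boolean queryFails(Matcher m) {
        try {
            m.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Returns {fails before any match, fails after find(), fails after reset()}.
    static boolean[] queryFailures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean before = queryFails(m);
        m.find();
        boolean afterFind = queryFails(m);
        m.reset();
        boolean afterReset = queryFails(m);
        return new boolean[] { before, afterFind, afterReset };
    }

    public static void main(String[] args) {
        // The input contains a match, so only the middle query succeeds.
        System.out.println(Arrays.toString(queryFailures("\\d+", "abc 42"))); // [true, false, true]
    }
}
```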
101 The implicit state of a matcher includes the input character sequence as well
102 as the append position, which is initially zero and is updated by the
103 appendReplacement(|java.util.regex.Matcher|) method.
A matcher may be reset explicitly by invoking its
reset(|java.util.regex.Matcher|) method or, if a new input sequence is desired,
its reset(CharSequence)(|java.util.regex.Matcher|) method. Resetting a matcher
discards its explicit state information and sets the append position to zero.
110 Instances of this class are not safe for use by multiple concurrent threads.
114 *java.util.regex.Matcher.appendReplacement(StringBuffer,String)*
116 public |java.util.regex.Matcher| appendReplacement(
117 java.lang.StringBuffer sb,
118 java.lang.String replacement)
120 Implements a non-terminal append-and-replace step.
122 This method performs the following actions:
It reads characters from the input sequence, starting at the append position,
and appends them to the given string buffer. It stops after reading the last
character preceding the previous match, that is, the character at index
start(|java.util.regex.Matcher|) - 1.
It appends the given replacement string to the string buffer.
It sets the append position of this matcher to the index of the last character
matched, plus one, that is, to end(|java.util.regex.Matcher|).
138 The replacement string may contain references to subsequences captured during
139 the previous match: Each occurrence of $g will be replaced by the result of
140 evaluating group(|java.util.regex.Matcher|) (g). The first number after the $
141 is always treated as part of the group reference. Subsequent numbers are
142 incorporated into g if they would form a legal group reference. Only the
143 numerals '0' through '9' are considered as potential components of the group
144 reference. If the second group matched the string "foo", for example, then
145 passing the replacement string "$2bar" would cause "foobar" to be appended to
146 the string buffer. A dollar sign ($) may be included as a literal in the
147 replacement string by preceding it with a backslash (\$).
149 Note that backslashes (\) and dollar signs ($) in the replacement string may
150 cause the results to be different than if it were being treated as a literal
151 replacement string. Dollar signs may be treated as references to captured
152 subsequences as described above, and backslashes are used to escape literal
153 characters in the replacement string.
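The group-reference and escaping rules can be tried out with a small wrapper around the append loop. The class and method names (GroupReferenceDemo, rewrite) are made up for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupReferenceDemo {
    // Replaces every match of regex in input, interpreting $g and \ in the
    // replacement string exactly as appendReplacement does.
    static String rewrite(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // $2 and $1 refer to the second and first capturing groups.
        System.out.println(rewrite("(\\w+) (\\w+)", "John Smith", "$2 $1")); // Smith John
        // \$ is a literal dollar sign; the following $1 is a group reference.
        System.out.println(rewrite("(\\d+)", "price 42", "\\$$1"));          // price $42
    }
}
```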
155 This method is intended to be used in a loop together with the
156 appendTail(|java.util.regex.Matcher|) and find(|java.util.regex.Matcher|)
157 methods. The following code, for example, writes one dog two dogs in the yard
158 to the standard-output stream:
    Pattern p = Pattern.compile("cat");
    Matcher m = p.matcher("one cat two cats in the yard");
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        m.appendReplacement(sb, "dog");
    }
    m.appendTail(sb);
    System.out.println(sb.toString());
168 sb - The target string buffer
169 replacement - The replacement string
173 *java.util.regex.Matcher.appendTail(StringBuffer)*
175 public |java.lang.StringBuffer| appendTail(java.lang.StringBuffer sb)
177 Implements a terminal append-and-replace step.
179 This method reads characters from the input sequence, starting at the append
180 position, and appends them to the given string buffer. It is intended to be
181 invoked after one or more invocations of the
182 appendReplacement(|java.util.regex.Matcher|) method in order to copy the
183 remainder of the input sequence.
186 sb - The target string buffer
188 Returns: The target string buffer
190 *java.util.regex.Matcher.end()*
public int end()
194 Returns the offset after the last character matched.
198 Returns: The offset after the last character matched
200 *java.util.regex.Matcher.end(int)*
202 public int end(int group)
204 Returns the offset after the last character of the subsequence captured by the
205 given group during the previous match operation.
Capturing groups are indexed from left to right, starting at one. Group zero
denotes the entire pattern, so the expression m.end(0) is equivalent to
m.end().
212 group - The index of a capturing group in this matcher's pattern
214 Returns: The offset after the last character captured by the group, or -1 if the match
215 was successful but the group itself did not match anything
217 *java.util.regex.Matcher.find()*
219 public boolean find()
Attempts to find the next subsequence of the input sequence that matches the
pattern.
This method starts at the beginning of this matcher's region, or, if a previous
invocation of the method was successful and the matcher has not since been
reset, at the first character not matched by the previous match.
If the match succeeds then more information can be obtained via the start, end,
and group methods.
Returns: true if, and only if, a subsequence of the input sequence matches this
matcher's pattern
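Repeated calls to find() walk through the input one match at a time. A minimal sketch follows; the helper name findAll and the "text@offset" output format are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Collects every match as "text@startIndex", in order of appearance.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) { // each call resumes after the previous match
            out.add(m.group() + "@" + m.start());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1bb22ccc333")); // [1@1, 22@4, 333@9]
    }
}
```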
236 *java.util.regex.Matcher.find(int)*
238 public boolean find(int start)
240 Resets this matcher and then attempts to find the next subsequence of the input
241 sequence that matches the pattern, starting at the specified index.
If the match succeeds then more information can be obtained via the start, end,
and group methods, and subsequent invocations of the
find(|java.util.regex.Matcher|) method will start at the first character not
matched by this match.
250 Returns: true if, and only if, a subsequence of the input sequence starting at the given
251 index matches this matcher's pattern
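A short sketch of searching from a chosen index is shown below. The helper name indexOfMatch is hypothetical; note that find(int) throws IndexOutOfBoundsException when the index is negative or greater than the input length, which this sketch does not guard against.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromDemo {
    // Returns the start index of the first match at or after 'from', or -1.
    static int indexOfMatch(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(from) ? m.start() : -1;
    }

    public static void main(String[] args) {
        String input = "one cat two cats";
        System.out.println(indexOfMatch("cat", input, 0));  // 4
        System.out.println(indexOfMatch("cat", input, 5));  // 12
        System.out.println(indexOfMatch("cat", input, 13)); // -1
    }
}
```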
253 *java.util.regex.Matcher.group()*
255 public |java.lang.String| group()
257 Returns the input subsequence matched by the previous match.
259 For a matcher m with input sequence s, the expressions m.group() and
260 s.substring(m.start(),m.end()) are equivalent.
Note that some patterns, for example a*, match the empty string. This method
will return the empty string when the pattern successfully matches the empty
string in the input.
268 Returns: The (possibly empty) subsequence matched by the previous match, in string form
270 *java.util.regex.Matcher.group(int)*
272 public |java.lang.String| group(int group)
Returns the input subsequence captured by the given group during the previous
match operation.
277 For a matcher m, input sequence s, and group index g, the expressions
278 m.group(g) and s.substring(m.start(g),m.end(g)) are equivalent.
Capturing groups are indexed from left to right, starting at one. Group zero
denotes the entire pattern, so the expression m.group(0) is equivalent to
m.group().
284 If the match was successful but the group specified failed to match any part of
285 the input sequence, then null is returned. Note that some groups, for example
286 (a*), match the empty string. This method will return the empty string when
287 such a group successfully matches the empty string in the input.
290 group - The index of a capturing group in this matcher's pattern
292 Returns: The (possibly empty) subsequence captured by the group during the previous
293 match, or null if the group failed to match part of the input
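The distinction between a group that did not participate (null) and a group that matched the empty string can be seen in a small sketch. The helper name groups is hypothetical, and it assumes the regex declares at least two capturing groups.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Returns {group(1), group(2)} after a full match, or null if there is none.
    // Assumes the regex has at least two capturing groups.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // (a)? is skipped, so group 1 is null; (b*) matches "", so group 2 is empty.
        System.out.println(Arrays.toString(groups("(a)?(b*)c", "c")));    // [null, ]
        System.out.println(Arrays.toString(groups("(a)?(b*)c", "abbc"))); // [a, bb]
    }
}
```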
295 *java.util.regex.Matcher.groupCount()*
297 public int groupCount()
299 Returns the number of capturing groups in this matcher's pattern.
Group zero denotes the entire pattern by convention. It is not included in this
count.
304 Any non-negative integer smaller than or equal to the value returned by this
305 method is guaranteed to be a valid group index for this matcher.
309 Returns: The number of capturing groups in this matcher's pattern
311 *java.util.regex.Matcher.hasAnchoringBounds()*
313 public boolean hasAnchoringBounds()
315 Queries the anchoring of region bounds for this matcher.
This method returns true if this matcher uses anchoring bounds, false
otherwise.
See useAnchoringBounds(|java.util.regex.Matcher|) for a description of
anchoring bounds.
323 By default, a matcher uses anchoring region boundaries.
327 Returns: true iff this matcher is using anchoring bounds, false otherwise.
329 *java.util.regex.Matcher.hasTransparentBounds()*
331 public boolean hasTransparentBounds()
333 Queries the transparency of region bounds for this matcher.
This method returns true if this matcher uses transparent bounds, false if it
uses opaque bounds.
338 See useTransparentBounds(|java.util.regex.Matcher|) for a description of
339 transparent and opaque bounds.
341 By default, a matcher uses opaque region boundaries.
345 Returns: true iff this matcher is using transparent bounds, false otherwise.
347 *java.util.regex.Matcher.hitEnd()*
349 public boolean hitEnd()
351 Returns true if the end of input was hit by the search engine in the last match
352 operation performed by this matcher.
354 When this method returns true, then it is possible that more input would have
355 changed the result of the last search.
359 Returns: true iff the end of input was hit in the last match; false otherwise
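One plausible use of hitEnd() is telling "the input is wrong" apart from "the input is incomplete so far". The sketch below assumes a simple literal pattern and a hypothetical helper named hitEndAfterMatches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HitEndDemo {
    // Runs matches() and reports whether the engine ran into the end of input.
    static boolean hitEndAfterMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.matches();
        return m.hitEnd();
    }

    public static void main(String[] args) {
        // "ca" fails only because the input ran out: more input could still match.
        System.out.println(hitEndAfterMatches("cat", "ca"));  // true
        // "dog" fails on its first character: more input cannot help.
        System.out.println(hitEndAfterMatches("cat", "dog")); // false
    }
}
```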
361 *java.util.regex.Matcher.lookingAt()*
363 public boolean lookingAt()
Attempts to match the input sequence, starting at the beginning of the region,
against the pattern.
368 Like the matches(|java.util.regex.Matcher|) method, this method always starts
369 at the beginning of the region; unlike that method, it does not require that
370 the entire region be matched.
If the match succeeds then more information can be obtained via the start, end,
and group methods.
Returns: true if, and only if, a prefix of the input sequence matches this matcher's
pattern
380 *java.util.regex.Matcher.matches()*
382 public boolean matches()
384 Attempts to match the entire region against the pattern.
If the match succeeds then more information can be obtained via the start, end,
and group methods.
391 Returns: true if, and only if, the entire region sequence matches this matcher's pattern
393 *java.util.regex.Matcher.pattern()*
395 public |java.util.regex.Pattern| pattern()
397 Returns the pattern that is interpreted by this matcher.
401 Returns: The pattern for which this matcher was created
403 *java.util.regex.Matcher.quoteReplacement(String)*
405 public static |java.lang.String| quoteReplacement(java.lang.String s)
407 Returns a literal replacement String for the specified String.
This method produces a String that will work as a literal replacement s in the
appendReplacement method of the Matcher(|java.util.regex.Matcher|) class. The
String produced will match the sequence of characters in s treated as a literal
sequence. Backslashes ('\') and dollar signs ('$') will be given no special
meaning.
416 s - The string to be literalized
418 Returns: A literal string replacement
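A minimal sketch of quoting a replacement that contains special characters follows. The helper name replaceLiterally and the sample strings are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteReplacementDemo {
    // Replaces every match of regex with 'literal' taken verbatim, so that
    // '$' and '\' in it are not interpreted as group references or escapes.
    static String replaceLiterally(String regex, String input, String literal) {
        return Pattern.compile(regex)
                      .matcher(input)
                      .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        // Unquoted, "$5" would be read as a reference to group 5.
        System.out.println(replaceLiterally("PRICE", "cost: PRICE", "$5.00")); // cost: $5.00
        // A lone backslash is also inserted as-is.
        System.out.println(replaceLiterally("SEP", "aSEPb", "\\"));            // a\b
    }
}
```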
420 *java.util.regex.Matcher.region(int,int)*
public |java.util.regex.Matcher| region(int start, int end)
426 Sets the limits of this matcher's region. The region is the part of the input
427 sequence that will be searched to find a match. Invoking this method resets the
428 matcher, and then sets the region to start at the index specified by the start
429 parameter and end at the index specified by the end parameter.
431 Depending on the transparency and anchoring being used (see
432 useTransparentBounds(|java.util.regex.Matcher|) and
433 useAnchoringBounds(|java.util.regex.Matcher|) ), certain constructs such as
434 anchors may behave differently at or around the boundaries of the region.
437 start - The index to start searching at (inclusive)
438 end - The index to end searching at (exclusive)
442 *java.util.regex.Matcher.regionEnd()*
444 public int regionEnd()
446 Reports the end index (exclusive) of this matcher's region. The searches this
447 matcher conducts are limited to finding matches within
448 regionStart(|java.util.regex.Matcher|) (inclusive) and
449 regionEnd(|java.util.regex.Matcher|) (exclusive).
453 Returns: the ending point of this matcher's region
455 *java.util.regex.Matcher.regionStart()*
457 public int regionStart()
459 Reports the start index of this matcher's region. The searches this matcher
460 conducts are limited to finding matches within
461 regionStart(|java.util.regex.Matcher|) (inclusive) and
462 regionEnd(|java.util.regex.Matcher|) (exclusive).
466 Returns: The starting point of this matcher's region
468 *java.util.regex.Matcher.replaceAll(String)*
470 public |java.lang.String| replaceAll(java.lang.String replacement)
472 Replaces every subsequence of the input sequence that matches the pattern with
473 the given replacement string.
This method first resets this matcher. It then scans the input sequence looking
for matches of the pattern. Characters that are not part of any match are
appended directly to the result string; each match is replaced in the result by
the replacement string. The replacement string may contain references to
captured subsequences as in the appendReplacement(|java.util.regex.Matcher|)
method.
482 Note that backslashes (\) and dollar signs ($) in the replacement string may
483 cause the results to be different than if it were being treated as a literal
484 replacement string. Dollar signs may be treated as references to captured
485 subsequences as described above, and backslashes are used to escape literal
486 characters in the replacement string.
488 Given the regular expression a*b, the input "aabfooaabfooabfoob", and the
489 replacement string "-", an invocation of this method on a matcher for that
490 expression would yield the string "-foo-foo-foo-".
492 Invoking this method changes this matcher's state. If the matcher is to be used
493 in further matching operations then it should first be reset.
496 replacement - The replacement string
498 Returns: The string constructed by replacing each matching subsequence by the
499 replacement string, substituting captured subsequences as needed
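The a*b example above, plus one with group references, can be reproduced with a thin wrapper. The class name ReplaceAllDemo and the key=value sample are hypothetical.

```java
import java.util.regex.Pattern;

class ReplaceAllDemo {
    // Replaces every match of regex in input with the replacement string.
    static String replaceAll(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(replaceAll("a*b", "aabfooaabfooabfoob", "-")); // -foo-foo-foo-
        // $2 and $1 swap the two captured halves of each key=value pair.
        System.out.println(replaceAll("(\\w+)=(\\w+)", "k1=v1;k2=v2", "$2=$1")); // v1=k1;v2=k2
    }
}
```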
501 *java.util.regex.Matcher.replaceFirst(String)*
503 public |java.lang.String| replaceFirst(java.lang.String replacement)
505 Replaces the first subsequence of the input sequence that matches the pattern
506 with the given replacement string.
This method first resets this matcher. It then scans the input sequence looking
for a match of the pattern. Characters that are not part of the match are
appended directly to the result string; the match is replaced in the result by
the replacement string. The replacement string may contain references to
captured subsequences as in the appendReplacement(|java.util.regex.Matcher|)
method.
515 Note that backslashes (\) and dollar signs ($) in the replacement string may
516 cause the results to be different than if it were being treated as a literal
517 replacement string. Dollar signs may be treated as references to captured
518 subsequences as described above, and backslashes are used to escape literal
519 characters in the replacement string.
521 Given the regular expression dog, the input "zzzdogzzzdogzzz", and the
522 replacement string "cat", an invocation of this method on a matcher for that
523 expression would yield the string "zzzcatzzzdogzzz".
525 Invoking this method changes this matcher's state. If the matcher is to be used
526 in further matching operations then it should first be reset.
529 replacement - The replacement string
531 Returns: The string constructed by replacing the first matching subsequence by the
532 replacement string, substituting captured subsequences as needed
534 *java.util.regex.Matcher.requireEnd()*
536 public boolean requireEnd()
538 Returns true if more input could change a positive match into a negative one.
540 If this method returns true, and a match was found, then more input could cause
541 the match to be lost. If this method returns false and a match was found, then
542 more input might change the match but the match won't be lost. If a match was
543 not found, then requireEnd has no meaning.
547 Returns: true iff more input could change a positive match into a negative one.
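A sketch contrasting a pattern that depends on the end of input with one that does not is shown below. The helper name requireEndAfterFind is hypothetical, and it rejects inputs without a match because requireEnd has no meaning in that case.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RequireEndDemo {
    // Returns requireEnd() after a successful find().
    static boolean requireEndAfterFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("no match, requireEnd is meaningless");
        }
        return m.requireEnd();
    }

    public static void main(String[] args) {
        // "cat$" matched only because the input ends here; appending text loses it.
        System.out.println(requireEndAfterFind("cat$", "cat")); // true
        // Plain "cat" stays matched no matter what is appended.
        System.out.println(requireEndAfterFind("cat", "cat"));  // false
    }
}
```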
549 *java.util.regex.Matcher.reset()*
551 public |java.util.regex.Matcher| reset()
Resets this matcher.
555 Resetting a matcher discards all of its explicit state information and sets its
556 append position to zero. The matcher's region is set to the default region,
557 which is its entire character sequence. The anchoring and transparency of this
558 matcher's region boundaries are unaffected.
564 *java.util.regex.Matcher.reset(CharSequence)*
566 public |java.util.regex.Matcher| reset(java.lang.CharSequence input)
568 Resets this matcher with a new input sequence.
570 Resetting a matcher discards all of its explicit state information and sets its
571 append position to zero. The matcher's region is set to the default region,
572 which is its entire character sequence. The anchoring and transparency of this
573 matcher's region boundaries are unaffected.
576 input - The new input character sequence
580 *java.util.regex.Matcher.start()*
public int start()
584 Returns the start index of the previous match.
588 Returns: The index of the first character matched
590 *java.util.regex.Matcher.start(int)*
592 public int start(int group)
594 Returns the start index of the subsequence captured by the given group during
595 the previous match operation.
Capturing groups are indexed from left to right, starting at one. Group zero
denotes the entire pattern, so the expression m.start(0) is equivalent to
m.start().
602 group - The index of a capturing group in this matcher's pattern
604 Returns: The index of the first character captured by the group, or -1 if the match was
605 successful but the group itself did not match anything
607 *java.util.regex.Matcher.toMatchResult()*
609 public |java.util.regex.MatchResult| toMatchResult()
611 Returns the match state of this matcher as a (|java.util.regex.MatchResult|) .
612 The result is unaffected by subsequent operations performed upon this matcher.
616 Returns: a MatchResult with the state of this matcher
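That a snapshot survives later match operations can be sketched as follows. The helper name firstAndSecond is an assumption for illustration.

```java
import java.util.Arrays;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SnapshotDemo {
    // Returns {text of the snapshotted first match, text of the second match},
    // or null when the input contains no match at all.
    static String[] firstAndSecond(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        MatchResult first = m.toMatchResult();       // frozen copy of the match state
        String second = m.find() ? m.group() : null; // advances the matcher itself
        return new String[] { first.group(), second };
    }

    public static void main(String[] args) {
        // The snapshot still reports "10" after the matcher has moved on to "20".
        System.out.println(Arrays.toString(firstAndSecond("\\d+", "10 20"))); // [10, 20]
    }
}
```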
618 *java.util.regex.Matcher.toString()*
620 public |java.lang.String| toString()
622 Returns the string representation of this matcher. The string representation of
623 a Matcher contains information that may be useful for debugging. The exact
624 format is unspecified.
628 Returns: The string representation of this matcher
630 *java.util.regex.Matcher.useAnchoringBounds(boolean)*
632 public |java.util.regex.Matcher| useAnchoringBounds(boolean b)
634 Sets the anchoring of region bounds for this matcher.
Invoking this method with an argument of true will set this matcher to use
anchoring bounds. If the boolean argument is false, then non-anchoring bounds
will be used.
Using anchoring bounds, the boundaries of this matcher's region match anchors
such as ^ and $.
Without anchoring bounds, the boundaries of this matcher's region will not
match anchors such as ^ and $.
646 By default, a matcher uses anchoring region boundaries.
649 b - a boolean indicating whether or not to use anchoring bounds.
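The effect of the setting on ^ can be sketched with a region that starts in the middle of the input. The helper name findInRegion and the sample input are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoringBoundsDemo {
    // Searches [start, end) of the input with anchoring bounds on or off.
    static boolean findInRegion(String regex, String input,
                                int start, int end, boolean anchoring) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.useAnchoringBounds(anchoring);
        m.region(start, end); // region() leaves the anchoring setting unchanged
        return m.find();
    }

    public static void main(String[] args) {
        // The region [2, 5) of "a cat" is exactly "cat".
        System.out.println(findInRegion("^cat", "a cat", 2, 5, true));  // true
        // Without anchoring bounds, ^ only matches at index 0 of the input.
        System.out.println(findInRegion("^cat", "a cat", 2, 5, false)); // false
    }
}
```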
653 *java.util.regex.Matcher.usePattern(Pattern)*
655 public |java.util.regex.Matcher| usePattern(java.util.regex.Pattern newPattern)
657 Changes the Pattern that this Matcher uses to find matches with.
659 This method causes this matcher to lose information about the groups of the
660 last match that occurred. The matcher's position in the input is maintained and
661 its last append position is unaffected.
664 newPattern - The new pattern used by this matcher
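One way to see that the position in the input is kept is to switch patterns between two find() calls. This is only a sketch; the helper name switchPattern and the sample input are made up.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UsePatternDemo {
    // Finds firstRegex, switches to secondRegex, and continues from where the
    // first match ended. Returns {first match, second match or null}.
    static String[] switchPattern(String input, String firstRegex, String secondRegex) {
        Matcher m = Pattern.compile(firstRegex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String first = m.group(); // read before the group information is lost
        m.usePattern(Pattern.compile(secondRegex));
        String second = m.find() ? m.group() : null;
        return new String[] { first, second };
    }

    public static void main(String[] args) {
        // The digit search resumes after "ab", so it finds "34" rather than "12".
        System.out.println(Arrays.toString(switchPattern("12ab34", "[a-z]+", "\\d+"))); // [ab, 34]
    }
}
```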
668 *java.util.regex.Matcher.useTransparentBounds(boolean)*
670 public |java.util.regex.Matcher| useTransparentBounds(boolean b)
672 Sets the transparency of region bounds for this matcher.
Invoking this method with an argument of true will set this matcher to use
transparent bounds. If the boolean argument is false, then opaque bounds will
be used.
Using transparent bounds, the boundaries of this matcher's region are
transparent to lookahead, lookbehind, and boundary matching constructs. Those
constructs can see beyond the boundaries of the region to see if a match is
appropriate.
683 Using opaque bounds, the boundaries of this matcher's region are opaque to
684 lookahead, lookbehind, and boundary matching constructs that may try to see
685 beyond them. Those constructs cannot look past the boundaries so they will fail
686 to match anything outside of the region.
688 By default, a matcher uses opaque bounds.
691 b - a boolean indicating whether to use opaque or transparent regions
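The difference shows up with a word-boundary pattern applied to a region that sits inside a longer word. The helper name findInRegion and the sample input are assumptions for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransparentBoundsDemo {
    // Searches [start, end) of the input with transparent or opaque bounds.
    static boolean findInRegion(String regex, String input,
                                int start, int end, boolean transparent) {
        Matcher m = Pattern.compile(regex).matcher(input);
        m.useTransparentBounds(transparent);
        m.region(start, end);
        return m.find();
    }

    public static void main(String[] args) {
        // The region [7, 10) of "Madagascar" is "car".
        // Opaque: \b cannot see the 's' before the region, so "car" looks like a word.
        System.out.println(findInRegion("\\bcar\\b", "Madagascar", 7, 10, false)); // true
        // Transparent: \b sees the preceding 's', so there is no boundary at index 7.
        System.out.println(findInRegion("\\bcar\\b", "Madagascar", 7, 10, true));  // false
    }
}
```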