Pattern pattern = Pattern.compile("a?");
Matcher matcher = pattern.matcher("a");
while (matcher.find()) {
    System.out.println(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
}
Output:
0[a]1
1[]1
Why does this give me two matches when the source string is a single character?
I noticed that this pattern always gives a zero-length match at the end of the source string.
E.g., when the source is "abab" it gives:
0[a]1
1[]1
2[a]3
3[]3
4[]4
The regex special character ? (question mark) means "match the preceding thing zero or one time".
Since you are matching in a while loop (while (matcher.find()) {...}), it finds both matches of the expression in the source "a": one occurrence of "a" (at position 0, the string "a") and zero occurrences of "a" (at position 1, the empty string at the very end).
For the source "abab", here is what your code snippet matches (start/end indices are denoted by X/Y):
0/1 "a"
1/1 "" (empty, before the first "b")
2/3 "a"
3/3 "" (empty, before the second "b")
4/4 "" (empty, at the very end)
There is no empty match at 0/0 or 2/2 because the quantifier is greedy: it takes the next character (giving 0/1 and 2/3) whenever doing so still produces a valid match, which it does here, so the empty matches at those positions are skipped.
To illustrate, if you were to match the string "bbbb" against the pattern a?, you would get five empty strings: one at the beginning, one at the end, and one between each pair of adjacent characters.
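The behaviour described above can be checked with a small, self-contained sketch. The class name ZeroLengthMatches and the helper findAll are hypothetical names introduced only for this illustration; the helper collects each match in the same start[text]end format that the question prints.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question or answer.
class ZeroLengthMatches {

    // Collects every match of regex in input as "start[text]end",
    // mirroring the println format used in the question.
    static List<String> findAll(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        List<String> results = new ArrayList<>();
        while (matcher.find()) {
            results.add(matcher.start() + "[" + matcher.group() + "]" + matcher.end());
        }
        return results;
    }

    public static void main(String[] args) {
        // One "a" followed by the empty match at the end of the input.
        System.out.println(findAll("a?", "a"));    // [0[a]1, 1[]1]
        // Greedy: "a" is consumed where present, empty matches elsewhere.
        System.out.println(findAll("a?", "abab")); // [0[a]1, 1[]1, 2[a]3, 3[]3, 4[]4]
        // No "a" at all: five empty matches, at positions 0 through 4.
        System.out.println(findAll("a?", "bbbb")); // [0[]0, 1[]1, 2[]2, 3[]3, 4[]4]
    }
}
```

If the trailing empty match is unwanted, one option is to skip results where matcher.start() == matcher.end() inside the loop, or to use the pattern a when at least one character must be matched.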