Possible Duplicate:
String.replaceAll() anomaly with greedy quantifiers in regex
I was writing code that uses Matcher#replaceAll and found the following result highly confusing:
Pattern.compile("(.*)").matcher("sample").replaceAll("$1abc");
Now, I would expect the output to be sampleabc but Java throws at me sampleabcabc.
Does anybody have any ideas why?
Now, sure, when I anchor the pattern (^(.*)$) the issue goes away. Still, I don't know why the hell replaceAll would do a double replacement like that.
And to add insult to injury, the following code:
Pattern.compile("(.*)").matcher("sample").replaceFirst("$1abc")
works as expected, returning just sampleabc.
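For anyone who wants to reproduce this, here is a minimal sketch that runs the three variants side by side. The class and helper names (ReplaceAllRepro, all, first) are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

public class ReplaceAllRepro {
    // Hypothetical helper: replaceAll with the given regex and replacement.
    static String all(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    // Hypothetical helper: replaceFirst with the given regex and replacement.
    static String first(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        // Unanchored pattern: the replacement is applied twice.
        System.out.println(all("(.*)", "sample", "$1abc"));    // sampleabcabc
        // Only the first match is replaced.
        System.out.println(first("(.*)", "sample", "$1abc"));  // sampleabc
        // Anchored pattern: a single replacement.
        System.out.println(all("^(.*)$", "sample", "$1abc"));  // sampleabc
    }
}
```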
It looks like it’s matching the empty string at the end of the input, for some reason. (I can see why it would match; I’m intrigued that it matches once and only once.)
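One way to check this is to list every match that Matcher#find reports, which is roughly the sequence replaceAll works through. This is only a sketch; the class name EmptyMatchDemo and the helper matches are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchDemo {
    // Hypothetical helper: returns each match as "start-end:'text'".
    static List<String> matches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end() + ":'" + m.group() + "'");
        }
        return out;
    }

    public static void main(String[] args) {
        // Two matches: the whole input, then the empty string at the end.
        System.out.println(matches("(.*)", "sample"));   // [0-6:'sample', 6-6:'']
        // .+ needs at least one character, so there is no trailing empty match.
        System.out.println(matches("(.+)", "sample"));   // [0-6:'sample']
        // ^ cannot match at index 6, so the anchored pattern matches only once.
        System.out.println(matches("^(.*)$", "sample")); // [0-6:'sample']
    }
}
```

The empty match shows up once and only once because, after a zero-length match, the next find() resumes one position further along, which is already past the end of the input here.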
If you change replaceAll("$1abc") to replaceAll("'$1'abc"), the result is 'sample'abc''abc.
Note that if you change (.*) to (.+) then it works correctly, because it has to match at least one character.
The diagnosis is confirmed by this code:
… which outputs: