I am trying to match inputs like
<foo>
<bar>
#####<foo>
#####<bar>
I tried #{5}?<\w+>, but it does not match <foo> and <bar>.
What’s wrong with this pattern, and how can it be fixed?
On ? for optional vs reluctant

The ? metacharacter in Java regex (and some other flavors) can have two very different meanings, depending on where it appears. Immediately following a repetition specifier, ? is a reluctant quantifier instead of a "zero-or-one"/"optional" repetition specifier.

Thus, #{5}? does not mean "optionally match 5 #". It in fact says "match 5 # reluctantly". It may not make much sense to try to match "exactly 5, but as few as possible", but this is in fact what this pattern means.

Grouping to the rescue!
One way to fix this problem is to group the optional pattern as (…)?. Something like this should work for this problem:

    (#{5})?<\w+>

Now the ? does not immediately follow a repetition specifier (i.e. *, +, ?, or {…}); it follows a closing parenthesis used for grouping.

Alternatively, you can also use a non-capturing group (?:…) in this case:

    (?:#{5})?<\w+>

This achieves the same grouping effect, but doesn't capture into \1.

References
java.util.regex.Pattern: X{n}?: X, exactly n times

Related questions
regex{n,}? == regex{n}? (absolutely NOT!)
.*? and .* for regex

Bonus material: What about ??

It's worth noting that you can use ?? to match an optional item reluctantly!

Note that Z?? is an optional Z, but it's matched reluctantly. "NOMZ" in its entirety still matches the pattern NOMZ??, but in replaceAll, NOMZ?? can match only "NOM" and doesn't have to take the optional Z even if it's there.

By contrast, NOMZ? will match the optional Z greedily: if it's there, it'll take it.

Related questions
matches a pattern against the entire String
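
The points made in this answer can be checked with a short Java sketch. It compares the original pattern with the two grouped variants on the four sample inputs from the question, then shows the greedy and reluctant optional Z in replaceAll. The class name OptionalVsReluctant, the constant names and the helper matchesWhole are illustrative choices made here, not part of the original answer.

```java
import java.util.regex.Pattern;

class OptionalVsReluctant {
    // Pattern from the question: "exactly five #, reluctantly", so the # are still required.
    static final String BROKEN = "#{5}?<\\w+>";
    // Capturing group made optional: zero or one run of five #.
    static final String GROUPED = "(#{5})?<\\w+>";
    // Same grouping effect, but nothing is captured.
    static final String NON_CAPTURING = "(?:#{5})?<\\w+>";

    // True if the regex matches the entire input (same semantics as String.matches).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[] inputs = {"<foo>", "<bar>", "#####<foo>", "#####<bar>"};
        for (String s : inputs) {
            // BROKEN is false for the inputs without #; both grouped patterns are true for all four.
            System.out.printf("%-11s broken=%b grouped=%b nonCapturing=%b%n",
                    s,
                    matchesWhole(BROKEN, s),
                    matchesWhole(GROUPED, s),
                    matchesWhole(NON_CAPTURING, s));
        }

        // Z?? is optional and reluctant, yet a whole-string match still has to consume the Z.
        System.out.println("NOMZ".matches("NOMZ??"));                     // true
        // Reluctant: each match stops at "NOM" and leaves any Z behind.
        System.out.println("NOM NOMZ NOMZZ".replaceAll("NOMZ??", "YUM")); // YUM YUMZ YUMZZ
        // Greedy: each match takes one Z if it is there.
        System.out.println("NOM NOMZ NOMZZ".replaceAll("NOMZ?", "YUM"));  // YUM YUM YUMZ
    }
}
```

Pattern.matches is used for the sample inputs because it requires the whole input to match, which makes the difference between "five # required" and "five # optional" visible directly.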