I’ve got a regular expression that looks something like this:
a(|bc)
This expression matches the string “a” perfectly, but it doesn’t match “abc”. What does the expression in the parentheses mean?
Edit:
Using C# with the following code:
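    // Requires: using System.Text.RegularExpressions;
    Match m = Regex.Match(TxtTest.Text, TxtRegex.Text);
    if (m.Success)
        RtfErgebnis.Text = m.Value;   // show the matched text
    else
        RtfErgebnis.Text = "Gültig, aber kein Match!";   // "Valid, but no match!"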
“TxtTest” contains the string to test (in this case “abc”).
“TxtRegex” contains the regular expression (in this case “a(|bc)”).
“RtfErgebnis” shows “Gültig, aber kein Match!” (“Valid, but no match!”), which means the regex is valid but the given test string did not match.
On a side note:
The expression
a(|bc)d
matches “ad” as well as “abcd”. So why does the previous expression not match “abc”?
I have no influence on the regular expression I will get. I just stumbled upon this special case. I need to know how to handle it for regex parsing and data generation.
Edit 2:
Regarding my earlier statement that “RtfErgebnis” shows “Gültig, aber kein Match!”, meaning the regex is valid but the given test string did not match:
I had a small error in the parameters I passed. With that fixed, it now shows “a”, which is completely correct.
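The pipe means “or”. Your first expression says “a, followed by nothing or bc”. The engine tries the alternatives from left to right, and the empty alternative succeeds immediately. Hence “a” is a full match, and the engine never needs to try “bc”.
The second expression says “a, followed by nothing or bc, followed by d”. In that version, a match is only complete when it reaches all the way through to “d”, so when the empty alternative fails against “abcd”, the engine backtracks and tries “bc”.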
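As a quick check of the behaviour described above, here is a minimal sketch for .NET. The class name `AlternationDemo` and the helper `FirstMatch` are made up for this illustration; only `Regex.Match`, `Match.Success` and `Match.Value` come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Hypothetical helper: returns the matched text, or a placeholder
    // string when the pattern does not match anywhere in the input.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "(no match)";
    }

    public static void Main()
    {
        // The empty alternative is tried first and succeeds, so only "a" is matched.
        Console.WriteLine(FirstMatch("abc", "a(|bc)"));    // a

        // The trailing "d" forces the engine to backtrack into the "bc" alternative.
        Console.WriteLine(FirstMatch("abcd", "a(|bc)d"));  // abcd
        Console.WriteLine(FirstMatch("ad", "a(|bc)d"));    // ad
    }
}
```

Note that the first call still reports a successful match on “abc”; the match simply covers only the leading “a”.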
If you want it to prefer the “bc” option over the nothing option, you could rewrite your expression like this:
a(bc)?
which means “a, followed by zero or one occurrence of bc”. Because the “?” quantifier is greedy, most engines will then treat “abc”, rather than “a”, as the full match.
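The optional-group form `a(bc)?` (“a, followed by zero or one occurrence of bc”) can be tried out the same way. This is again only a sketch for .NET, and the names `OptionalGroupDemo` and `FirstMatch` are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalGroupDemo
{
    // Hypothetical helper: returns the matched text, or a placeholder
    // string when the pattern does not match anywhere in the input.
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "(no match)";
    }

    public static void Main()
    {
        // The greedy "?" tries to take "bc" first, so the whole input is matched.
        Console.WriteLine(FirstMatch("abc", "a(bc)?"));  // abc

        // With no "bc" available, the group is skipped and "a" alone matches.
        Console.WriteLine(FirstMatch("a", "a(bc)?"));    // a
    }
}
```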