I need to match the following sets of input:
foo_abc_bar
foo_bar
and get “abc” or an empty string as the result.
So this is the regular expression I wrote:
r'foo_(abc|)[_|]bar'
But for some reason, this does not match the second string.
On further inspection, I found that [_|] does not match an empty string.
So, how do I solve this problem?
To make abc_ optional, you could use the question mark operator:
(abc_)?
Thus, the entire regex becomes:
r'foo_(abc_)?bar'
With this regex, the second underscore (if present) will become part of the capture group. If you don’t want that, you could either remove it post-match with .rstrip('_') or use a slightly more complex regex:
r'foo_(?:(abc)_)?bar'
As for your observation that [_|] does not match an empty string: that’s right. Square brackets denote a character group. [_|] would match exactly one underscore or exactly one vertical bar, and nothing else. In other words, the vertical bar loses its special meaning when it appears inside a character group.
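Here is a small sketch that checks this behaviour against the two inputs from the question, assuming Python's re module (the r'' prefix suggests it). The names ORIGINAL, SIMPLE, STRICT and extract are made up for illustration, and the non-capturing pattern is one plausible form of the "slightly more complex regex", not necessarily the only one. Note that when the optional group does not take part in the match, group(1) is None rather than an empty string, so the helper converts it.

```python
import re

# The pattern from the question: [_|] is a character class, so it must
# consume exactly one character ('_' or '|') and can never match "nothing".
ORIGINAL = re.compile(r'foo_(abc|)[_|]bar')

# Optional group that includes the trailing underscore in the capture.
SIMPLE = re.compile(r'foo_(abc_)?bar')

# Non-capturing wrapper (?:...) makes "abc_" optional as a unit,
# while the inner group captures only "abc".
STRICT = re.compile(r'foo_(?:(abc)_)?bar')


def extract(text):
    """Return 'abc', '' when the middle part is absent, or None if no match."""
    m = STRICT.fullmatch(text)
    if m is None:
        return None
    # group(1) is None when the optional part did not participate.
    return m.group(1) or ''


for s in ('foo_abc_bar', 'foo_bar'):
    print(s, '->', repr(extract(s)))
```

If you prefer the simpler pattern, (SIMPLE.fullmatch(s).group(1) or '').rstrip('_') gives the same result for these two inputs.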