I’m trying to write a regex that matches xa?b?c? but not x. In reality, ‘x’, ‘a’, ‘b’, and ‘c’ are not single characters, they are moderately complex sub-expressions, so I’m trying to avoid something like x(abc|ab|ac|bc|a|b|c). Is there a simple way to match “at least one of a, b, and c, in that order” in a regex, or am I out of luck?
Here’s the shortest version:
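As a rough illustration of one compact pattern of this kind, here is a sketch in Python. It is not necessarily the exact pattern meant above. The single-letter sub-patterns `A`, `B`, `C` and the name `at_least_one` are placeholders I made up; the technique is a conditional on each capture group, with an always-failing `(?!)` as the last resort, and it assumes none of the parts can match the empty string.

```python
import re

# Toy stand-ins; in practice these would be longer sub-expressions.
A, B, C = r"a", r"b", r"c"

# Each optional part gets its own capture group. The trailing conditionals
# succeed only if group 1, 2, or 3 took part in the match; otherwise the
# empty negative lookahead (?!) forces a failure.
at_least_one = re.compile(
    rf"({A})?({B})?({C})?(?(1)|(?(2)|(?(3)|(?!))))"
)

for s in ["", "a", "b", "c", "ab", "ac", "bc", "abc", "cb", "ba"]:
    print(repr(s), bool(at_least_one.fullmatch(s)))
```

Each of `A`, `B`, and `C` is written exactly once, which is the point when they are long.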
If you need to keep around the match in a separate group, write this:
But that isn’t very robust in case a, b, or c contain capture groups. So instead write this:

And if you need a group for the whole match, then write this:
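One way to get both properties, sketched in Python under my own assumptions: name the three optional groups so the conditionals do not depend on group numbers, and wrap the run in one more named group. The sub-patterns and the group names `whole`, `A`, `B`, `C` are hypothetical, chosen only to show inner capture groups not getting in the way.

```python
import re

# Hypothetical sub-patterns that bring capture groups of their own.
A = r"(a)(1)?"
B = r"b(2)?"
C = r"(c)"

# Named groups keep the conditionals correct no matter how many numbered
# groups A, B, and C introduce; "whole" captures the entire run.
robust = re.compile(
    rf"(?P<whole>(?P<A>{A})?(?P<B>{B})?(?P<C>{C})?)"
    rf"(?(A)|(?(B)|(?(C)|(?!))))"
)

m = robust.search("--a1c--")
print(m.group("whole"), m.group("A"), m.group("B"), m.group("C"))
```

Here `m.group("B")` is `None` because the `B` part did not take part in the match, while `whole` still holds everything that did.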
And if like me you prefer multi-lettered identifiers and also think this sort of thing is insane without being in /x mode, write this:

And here is the full testing program to prove that those all work:
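A small harness in the same spirit, sketched in Python rather than Perl, using `re.VERBOSE` as the counterpart of /x mode. The group names `alpha`, `beta`, `gamma`, the literal parts `foo`, `bar`, `baz`, the helper `accepts`, and the sample strings are all illustrative assumptions, not the ones from the program referred to above.

```python
import re

# Verbose mode lets the pattern be laid out and commented.
verbose_pattern = re.compile(
    r"""
    (?P<alpha> foo )?          # optional first part
    (?P<beta>  bar )?          # optional second part
    (?P<gamma> baz )?          # optional third part
    (?(alpha)    |             # fine if alpha took part
      (?(beta)   |             # else fine if beta took part
        (?(gamma) | (?!) )))   # else gamma must have; otherwise fail
    """,
    re.VERBOSE,
)


def accepts(text: str) -> bool:
    """Return True if the whole string is a non-empty in-order run."""
    return verbose_pattern.fullmatch(text) is not None


samples = ["", "foo", "bar", "baz", "foobar", "foobaz",
           "barbaz", "foobarbaz", "bazfoo", "barfoo"]
for sample in samples:
    print(repr(sample), accepts(sample))
```

The empty string and the out-of-order samples are the interesting cases: they are the ones a plain run of three optional parts gets wrong or cannot distinguish.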
All five versions produce this output:
Sweet, eh?
EDIT: For the x in the beginning part, just put whatever x you want at the start of the match, before the very first optional capture group for the a part, so like this:

or like this:
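To make the placement concrete, a Python sketch with a made-up prefix: `X` here is `\d+` purely for illustration, and `prefixed`, `text`, and `found` are hypothetical names. The mandatory prefix goes first, the optional groups follow, and the conditionals at the end reject a bare prefix.

```python
import re

# Hypothetical prefix and parts; any sub-patterns could stand in for these.
X = r"\d+"
A, B, C = r"a", r"b", r"c"

# X is mandatory and sits before the first optional capture group.
prefixed = re.compile(
    rf"{X}({A})?({B})?({C})?(?(1)|(?(2)|(?(3)|(?!))))"
)

text = "7 7a 42bc 9c 10abc"
found = [m.group(0) for m in prefixed.finditer(text)]
print(found)
```

The lone `7` is skipped because none of the three groups took part after it.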
The test sentence was constructed without the x part, so it won’t work for that, but I think I’ve shown how I mean to go at this. Note that all of x, a, b, and c can be arbitrarily complex patterns (yes, even recursive), not merely single letters, and it doesn’t matter if they use numbered capture groups of their own, even.

If you want to go at this with lookaheads, you can do this:
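A minimal lookahead sketch in Python, under stated assumptions: the parts `a+`, `b+`, `c+` and the name `with_lookahead` are invented, and no part may match the empty string. Each sub-pattern is written once as a Python string, though it is interpolated into both the lookahead and the body, so the compiled regex does contain it twice.

```python
import re

# Hypothetical parts; each must consume at least one character.
A, B, C = r"a+", r"b+", r"c+"

# The lookahead insists that one of the parts can start at this position;
# the three optional parts then consume whatever is there, in order.
with_lookahead = re.compile(
    rf"(?=(?:{A})|(?:{B})|(?:{C}))(?:{A})?(?:{B})?(?:{C})?"
)

for s in ["", "aab", "bcc", "c", "ac", "ca", "ba"]:
    print(repr(s), bool(with_lookahead.fullmatch(s)))
```

One caveat of this form: if more pattern follows the optional parts, backtracking can let them all match nothing after the lookahead has already succeeded, so it is safest used on its own or anchored as above.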
And here is what to add to the @pats array in the test program to show that this approach also works:

You’ll notice please that I still manage never to repeat any of a, b, or c, even with the lookahead technique.

Do I win? ☺