Let’s say we have the following expression:
(?x) ((a b) (c) | (d) (e f) | (g) h (i) )
We’ll get the following backreferences:
\1 abc, def, ghi
\2 ab
\3 c
\4 d
\5 ef
\6 g
\7 i
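To make the numbering concrete, here is a minimal PHP sketch of the expression above. The subject string 'abc def ghi' is an assumption chosen only so that every branch gets a chance to match; it is not from the original post.

```php
<?php
// Ordinary capturing groups are numbered left to right across ALL branches.
$pattern = '/(?x) ((a b) (c) | (d) (e f) | (g) h (i) )/';
$subject = 'abc def ghi'; // illustrative input (assumption)

// PREG_PATTERN_ORDER: $matches[N] lists what group N captured in each match.
// Groups that did not take part in a given match appear as empty strings.
preg_match_all($pattern, $subject, $matches, PREG_PATTERN_ORDER);

for ($group = 1; $group <= 7; $group++) {
    echo '\\', $group, ' ', json_encode($matches[$group]), PHP_EOL;
}
```

Each alternative only fills its own group numbers, which is why \2 and \3 stay empty whenever the second or third branch matches.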
The Notepad++ regex engine supports a wonderful construct:
(?|expression using the alternation | operator)
This construct resets the subexpression counter at the start of each branch, so the numbering in one branch is not affected by the groups in the other branches of the alternation. And it works perfectly fine. So
(?x) (?| (a b) (c) | (d) (e f) | (g) h (i) )
creates the following groups:
\1 ab, d, g
\2 c, ef, i
But when I tried to use this construct in PHP I got an error.
Warning: preg_match_all(): Compilation failed: unrecognized character after (? or (?- at offset 11 in …
So, is it possible in PHP to have the same numbering for the different branches of an alternation (as in the aforementioned construct)?
The solution
The problem was that I put a line break between ‘?’ and ‘|’. I used the ‘x’ modifier and thought it allowed line breaks anywhere in the expression. But as it turned out, you can’t break the "?|" part, regardless of the modifiers used.
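A small sketch of that pitfall, assuming a shortened two-branch pattern and the subject 'abc' (both made up for illustration). The only difference between the two patterns is where the line break sits relative to the "(?|" token. The exact warning text depends on the PCRE version bundled with PHP, so the sketch only checks the return value.

```php
<?php
// Line break INSIDE the "(?|" token: compilation fails even with (?x).
$broken = "/(?x) (?\n| (a b) (c) | (d) (e f) )/";

// Line break AFTER the complete "(?|" token: ignored as whitespace under (?x).
$fixed = "/(?x) (?|\n (a b) (c) | (d) (e f) )/";

// @ hides the compilation warning; preg_match() returns false on a bad pattern.
$brokenResult = @preg_match($broken, 'abc', $brokenGroups);
$fixedResult  = preg_match($fixed, 'abc', $fixedGroups);

var_dump($brokenResult === false); // bool(true)
var_dump($fixedResult);            // int(1)
```

In other words, extended mode lets you lay out the pieces of a pattern freely, but multi-character tokens such as "(?|" have to stay intact.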
The docs for subpatterns in PCRE say it’s supported. There’s an example at the bottom of that page.
Edit: Just tried it in PHP 5.3.15 and it works.
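For reference, a minimal sketch of the working version, using the pattern from the question. The subject string 'abc def ghi' is an assumption picked so that each branch matches once.

```php
<?php
// Branch reset group: every alternative starts numbering its groups at 1.
$pattern = '/(?x) (?| (a b) (c) | (d) (e f) | (g) h (i) )/';
$subject = 'abc def ghi'; // illustrative input (assumption)

// PREG_SET_ORDER: one array per match, holding the full match and its groups.
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    echo $match[0], ': \\1=', $match[1], ', \\2=', $match[2], PHP_EOL;
}
// abc: \1=ab, \2=c
// def: \1=d, \2=ef
// ghi: \1=g, \2=i
```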