A while ago, I saw that in regex (at least in PHP) you can make a group non-capturing by putting ?: right after its opening parenthesis.
Example
$str = 'big blue ball';
$regex = '/b(ig|all)/';
preg_match_all($regex, $str, $matches);
var_dump($matches);
Outputs…
array(2) {
[0]=>
array(2) {
[0]=>
string(3) "big"
[1]=>
string(4) "ball"
}
[1]=>
array(2) {
[0]=>
string(2) "ig"
[1]=>
string(3) "all"
}
}
In this example, I don’t care about what was matched in the parentheses, so I prepended ?: to the group ('/b(?:ig|all)/') and got this output:
array(1) {
[0]=>
array(2) {
[0]=>
string(3) "big"
[1]=>
string(4) "ball"
}
}
This is very useful – at least I think so. Sometimes you just don’t want to clutter your matches with unnecessary values.
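A minimal sketch of where this helps, assuming a hypothetical pattern that mixes both kinds of group (the pattern and variable names are illustrative, not from the example above). The non-capturing group handles the alternation, so the one value that is actually wanted stays at index 1:

```php
<?php
// Hypothetical input and pattern, for illustration only.
$str = 'big blue ball';

// (?:big|small) groups the alternation without capturing it;
// (\w+) is the only capturing group, so it is always index 1.
$regex = '/(?:big|small) (\w+) ball/';

preg_match($regex, $str, $matches);

var_dump(count($matches)); // whole match plus one captured group
var_dump($matches[1]);     // the captured colour
```

Had the first group been a plain (big|small), the colour would have moved to index 2 and index 1 would hold a value nobody asked for.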
I was trying to look up the documentation and the official name for this (I call it a non-capturing group, and I think I’ve heard that term before).
Since it consists only of symbols, it is hard to Google for.
I have also looked at a number of regex reference guides, none of which mention it.
Because it is prefixed with ? and appears as the first characters inside the parentheses, I was led to believe it has something to do with lookaheads or lookbehinds.
So, what is the proper name for these, and where can I learn more?
It’s available on the Subpatterns page of the official documentation.
It’s also good to note that you can set options for the subpattern with it. For example, if you want only the subpattern to be case insensitive, you can do:
Will match:
But not
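A minimal sketch of this inline-option form, assuming a hypothetical pattern '/(?i:foo)bar/' and made-up subject strings: the i flag placed between ? and : applies only inside that subpattern, so "foo" matches in any case while "bar" must be lowercase.

```php
<?php
// Hypothetical pattern: only "foo" is case-insensitive; "bar" stays case-sensitive.
$regex = '/(?i:foo)bar/';

// Hypothetical subjects chosen to show both outcomes.
$subjects = ['foobar', 'FOObar', 'fOobar', 'fooBAR', 'FooBar'];

$results = [];
foreach ($subjects as $subject) {
    // preg_match returns 1 on a match and 0 on no match.
    $results[$subject] = preg_match($regex, $subject) === 1;
}

var_dump($results);

// The (?i:...) form is still non-capturing: only the whole match is stored.
preg_match($regex, 'FOObar', $matches);
var_dump(count($matches));
```

In this sketch the first three subjects match and the last two do not, because their "bar" part contains uppercase letters.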
Oh, and while the official documentation doesn’t explicitly name the syntax, it does refer to it later on as a "non-capturing subpattern" (which makes complete sense, and is what I would call it anyway, since it’s not really a "group", but a subpattern)…