Using preg_match with named subpatterns always returns an array that holds the same data twice: once under the subpattern name and once under a numeric index. Because I'm matching hundreds of thousands of lines, at a few kilobytes per line, I'm afraid the numeric entries are taking up extra memory. Is there a proper way to stop the numeric entries from being returned?
Example:
<?php
header('Content-Type: text/plain');
$data = <<<START
I go to school.
He goes to funeral.
START;
preg_match_all('@^(?<who>.*?) go(es)* to (?<place>.*?)$@m', $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => I go to school.
[1] => He goes to funeral.
)
[who] => Array
(
[0] => I
[1] => He
)
[1] => Array
(
[0] => I
[1] => He
)
[2] => Array
(
[0] =>
[1] => es
)
[place] => Array
(
[0] => school.
[1] => funeral.
)
[3] => Array
(
[0] => school.
[1] => funeral.
)
)
Judging from the Subpatterns page on php.net, I see no option to return only the named keys.
So I think that if you don't want the data twice, the only possibility is not to use named groups.
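If the names are worth keeping, one workaround is to drop the numeric keys after the match. This is only a minimal sketch: it assumes PHP 5.6 or later for ARRAY_FILTER_USE_KEY, and $named is just an illustrative variable name. It does not stop preg_match_all from building the numeric entries in the first place; it only discards them afterwards, so it mainly helps when the results are kept around.

```php
<?php
$data = <<<START
I go to school.
He goes to funeral.
START;

preg_match_all('@^(?<who>.*?) go(es)* to (?<place>.*?)$@m', $data, $matches);

// Numeric group indexes are integer keys, group names are string keys,
// so keeping only the string keys leaves just the named groups.
$named = array_filter($matches, 'is_string', ARRAY_FILTER_USE_KEY);

print_r($named);
```

After filtering, $named contains only the who and place entries.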
Is this really an issue? IMO, optimize this only if you actually run into problems because of the additional memory usage. The improved readability should be worth the memory.
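To find out whether it is a problem at all, you could measure before optimizing. A rough sketch using memory_get_usage() and memory_get_peak_usage(); the sample data is just the two lines from the question, so the figures only become meaningful with your real input.

```php
<?php
$data = "I go to school.\nHe goes to funeral.\n";

$before = memory_get_usage();
preg_match_all('@^(?<who>.*?) go(es)* to (?<place>.*?)$@m', $data, $matches);
$after = memory_get_usage();

// Difference in allocated memory around the call, i.e. roughly the cost of $matches.
$delta = $after - $before;

printf("matches array: roughly %d bytes\n", $delta);
printf("peak usage:    %d bytes\n", memory_get_peak_usage());
```

Running the same measurement with and without named groups on a realistic batch of lines should show whether the duplicate entries matter in practice.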
Update
It looks like go(es)* should only match an optional "es". Here you can save memory by using a non-capturing group: when the group starts with ?: the matched content is not stored. I also replaced the * (which means 0 or more and would also match "goeseses") with ? (which means 0 or 1).
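Applied to the pattern from the question, the change described above would look roughly like this (a sketch that reuses the question's sample data):

```php
<?php
$data = <<<START
I go to school.
He goes to funeral.
START;

// (?:es)? is a non-capturing, optional group: it matches "es" zero or one time
// and stores nothing in $matches.
preg_match_all('@^(?<who>.*?) go(?:es)? to (?<place>.*?)$@m', $data, $matches);

print_r($matches);
```

The result now has the keys 0, who, 1, place and 2. The entry for the "es" group is gone, but the numeric duplicates of the named groups are still there.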