I have two strings which contain up to 3 elements:
1) anychar[price]{alphanum} e.g. a1\')[=00.00]{a1234}
2) anychar:anychar{alphanum} e.g. a1\'):a2\'){a1234}
…but the {} element is optional and may not always be there. I wrote the following patterns (respectively):
1) /(.+)\[(.+)\]\{*(\w+)*\}*/ – works as expected
2) /(.+)\:(.+)\{*(\w+)*\}*/ – works fine if the {} element is removed, but not with it.
The result array for 2 is as follows:
(
[0] => a1\'):a2\'){a123}
[1] => a1\')
[2] => a2\'){a123}
)
I’ve tried a few different permutations of the above but no dice. Any ideas?
First, remove the * after {, } and (\w+). For the second pattern this gives /(.+)\:(.+)\{(\w+)\}/.
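* means zero or more repetitions, and quantifiers in PCRE are greedy: (.+) first grabs as much as it can and only gives characters back when the rest of the pattern fails to match. If you make the whole third part optional (by using * everywhere), the rest of the pattern can match nothing at all, so the second group keeps everything up to the end of the string and the third group stays empty. That’s why your code didn’t work.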
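The behaviour described above can be reproduced in a few lines. This is a minimal sketch that uses Python's `re` module as a stand-in for PCRE, on the assumption that both engines treat these particular constructs the same way. The variable names are illustrative, and the unnecessary backslash before the colon has been dropped.

```python
import re

# Pattern 2 from the question, written as a Python raw string.
original = re.compile(r"(.+):(.+)\{*(\w+)*\}*")
# The same pattern with the * quantifiers removed, so the {} part is mandatory.
strict = re.compile(r"(.+):(.+)\{(\w+)\}")

with_braces = r"a1\'):a2\'){a123}"
without_braces = r"a1\'):a2\')"

# Greedy (.+) runs to the end of the string. Nothing after it is required,
# so the engine never has to backtrack and group 3 stays empty.
greedy_groups = original.match(with_braces).groups()

# With the braces mandatory, (.+) must give characters back until \{ matches.
strict_groups = strict.match(with_braces).groups()

# The price: a string without the {} part no longer matches at all.
strict_missing = strict.match(without_braces)

print(greedy_groups)
print(strict_groups)
print(strict_missing)
```

Pattern 1 from the question does not suffer from this because the literal ] is mandatory, which forces its second group to stop before the {} part.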
Now, to deal with the fact that the third part is optional, you can use a positive lookahead: in the second group, you ask PCRE to accept a match only if another regex can match right after it. The final regex adds such a lookahead inside the second group.
What I changed:
Inside the second group, I added a positive lookahead of the form (?=regex). As said, this means the lookahead has to match for the group to match. A lookahead is zero-width and does not capture by itself, which means it doesn’t create an entry in your final result.
Inside that lookahead, I created two cases (an alternation), which means that the .+ from the second group is accepted only where one of the two cases matches.
The first case is very basic: end of string not preceded by a }. This matches when the 3rd part is not there.
The second case is the selector for the 3rd group. We make it a capturing group so that it is returned in the results if present.
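One pattern that fits the description above is sketched here. It is a reconstruction under assumptions, not necessarily the exact regex the answer used: the lookahead sits at the end of the second group, the first case is written as `(?<!\})$`, and the second case as `\{(\w+)\}`. The helper name `split_parts` is hypothetical, and Python's `re` module again stands in for PCRE.

```python
import re

# Hypothetical reconstruction, not the answer's original pattern.
# Group 2 is (.+) followed, inside the group, by a lookahead with two cases:
#   (?<!\})$      end of string not preceded by }
#   \{(\w+)\}     the optional third part, captured as group 3
pattern = re.compile(r"(.+):(.+(?=(?<!\})$|\{(\w+)\}))")

def split_parts(text):
    """Return (first, second, third); third is None when the {} part is absent."""
    m = pattern.match(text)
    return m.groups() if m else None

print(split_parts(r"a1\'):a2\'){a123}"))
print(split_parts(r"a1\'):a2\')"))
```

Because the lookahead consumes nothing, the full match (index 0 of the result) stops before the {} part. The third part is still available through its own capturing group.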