I am trying to create a regular expression (in PHP) that matches any of these texts:
#{text}
#{text1}{text2}
#{text1}{numbers}{text2}
#{text1}{text with spaces}{numbers}{text2}
And so on. Basically, the first block can only hold text without spaces, while the rest can hold anything. After matching, I want to get text1 … numbers, etc. as the captured matches. I’ve tried several regular expressions, but none of them worked. Here’s the last one:
/#{(\w+)}({([\ a-zA-Z0-9*])})*/U
Thanks in advance!
EDIT: Just like @stema suggested, I changed my regexp to this one:
/#\{(\w+)\}(\{([^}]*)\})*/
I dropped the Ungreedy flag because it was not helping the expression at all :). However, I get fewer results than I need:
array(4) {
[0]=>
string(42) "#{text1}{text with spaces}{numbers}{text2}"
[1]=>
string(5) "text1"
[2]=>
string(7) "{text2}"
[3]=>
string(5) "text2"
}
It seems that the in-between parameters are not captured (which looks weird to me).
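The main issue I see is that the quantifier is misplaced. In
[\ a-zA-Z0-9*]
the * sits inside the character class, where it only matches a literal asterisk. It should be outside the character class:
[\ a-zA-Z0-9]*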
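To see what difference the position of the quantifier makes, here is a small sketch. The variable names and the anchored test patterns are my own, chosen only for illustration; they are not part of your original expression.

```php
<?php
// Quantifier inside the class: exactly one character, and * is a literal asterisk.
$inside  = '/^\{[ a-zA-Z0-9*]\}$/';
// Quantifier outside the class: zero or more of the listed characters.
$outside = '/^\{[ a-zA-Z0-9]*\}$/';

$insideStar  = preg_match($inside, '{*}');                  // literal asterisk is accepted
$insideText  = preg_match($inside, '{text with spaces}');   // more than one character: no match
$outsideText = preg_match($outside, '{text with spaces}');  // any number of listed characters
$outsideStar = preg_match($outside, '{*}');                 // asterisk is not in the class
```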
If the content of the following braces can be anything, then you could do this:
#\{(\w+)\}(\{([^}]*)\})*
[^}] is a negated character class that matches anything but the closing curly bracket. I also escaped the curly braces, since they have a special meaning as part of a quantifier. Some languages will match them literally when they do not form such a quantifier, but for clarity it’s better to always escape them when they should be matched literally.
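As a quick check, the following sketch runs the escaped pattern against the four sample texts from the question. The $samples and $results names are only illustrative, and the last line adds a counterexample of my own with a space in the first block.

```php
<?php
// Escaped braces, negated class for the later blocks, quantifier outside the class.
$pattern = '/#\{(\w+)\}(\{([^}]*)\})*/';

$samples = [
    '#{text}',
    '#{text1}{text2}',
    '#{text1}{numbers}{text2}',
    '#{text1}{text with spaces}{numbers}{text2}',
];

$results = [];
foreach ($samples as $sample) {
    // True only if the pattern matches and the match covers the whole sample.
    $results[$sample] = preg_match($pattern, $sample, $m) === 1 && $m[0] === $sample;
}

// A space in the first block is rejected, because \w does not match spaces.
$rejected = preg_match($pattern, '#{two words}') === 0;
```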
Update:
You can make the outer group a non-capturing group, since you don’t seem to need it:
#\{(\w+)\}(?:\{([^}]*)\})*
This should give you the full match at array[0], text1 at array[1] and text2 at array[2], but you will always get only the last match of the repeated group in your resulting array, because each match is stored at array[2]. The second match will overwrite the first, the third the second … What you could do is use the regex for format validation and then split the string to get the individual blocks.
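Here is a minimal sketch of that idea, under a few assumptions of mine: the helper name parseBlocks is hypothetical, the validation pattern is anchored with \A and \z so the whole string has to fit the format, and the split is done with explode on "}{" after cutting off the leading "#{" and the trailing "}". Other ways of splitting would work just as well.

```php
<?php
// Hypothetical helper: returns all blocks, or null if the format is invalid.
function parseBlocks(string $input): ?array
{
    // Step 1: validate the whole string; no capturing is needed here.
    if (preg_match('/\A#\{\w+\}(?:\{[^}]*\})*\z/', $input) !== 1) {
        return null;
    }
    // Step 2: drop the leading "#{" and the trailing "}", then split on "}{".
    return explode('}{', substr($input, 2, -1));
}

$input = '#{text1}{text with spaces}{numbers}{text2}';

// With the repeated group, only its last repetition survives in $m[2].
preg_match('/#\{(\w+)\}(?:\{([^}]*)\})*/', $input, $m);
// $m is ['#{text1}{text with spaces}{numbers}{text2}', 'text1', 'text2']

$blocks = parseBlocks($input);
// $blocks is ['text1', 'text with spaces', 'numbers', 'text2']
```

Because the validation step guarantees that no block contains a closing brace, "}{" can only occur between two blocks, so the plain explode is safe here.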