I’m having an issue with my regex.
I want to capture <% some stuff %> and I need what’s inside the <% and the %>.
This regex works quite well for that.
$matches = preg_split('/<%[\s]*(.*?)[\s]*%>/i',$markup,-1,(PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE));
I also want to catch &lt;% some stuff %&gt;, so I need to match <% or &lt;% at the start and %> or %&gt; at the end, respectively.
If I put in a second set of parens, it makes preg_split behave differently (because, as you can see from the flags, I’m trying to capture what’s inside the parens).
Preferably, it would only match &lt; to &gt; and < to > as pairs, but that’s not strictly necessary.
EDIT: The subject string may contain multiple matches, and I need all of them.
In your case, it’s better to use preg_match with its third parameter (the matches array) and capturing parentheses.
By the way, check out this online tool for debugging PHP regexps; it’s very useful!
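As a minimal sketch of that approach (the pattern and the helper name first_tag_content are illustrative assumptions, not necessarily the exact code originally posted), the delimiters can go in non-capturing groups (?:...) so that the only captured group is the content:

```php
<?php
// Hypothetical helper: returns the content of the first <% ... %> or
// &lt;% ... %&gt; tag in $markup, or null when there is none.
function first_tag_content(string $markup): ?string
{
    // (?:<|&lt;) and (?:>|&gt;) group without capturing, so index 1 of
    // the matches array is always the text between the delimiters.
    // The s modifier lets . match newlines inside the tag.
    $pattern = '/(?:<|&lt;)%\s*(.*?)\s*%(?:>|&gt;)/s';

    if (preg_match($pattern, $markup, $m) === 1) {
        return $m[1];
    }
    return null;
}

echo first_tag_content('before <% some stuff %> after'), "\n";
echo first_tag_content('before &lt;% other stuff %&gt; after'), "\n";
```

Because non-capturing groups add no entries to the result, the same trick should also keep preg_split with PREG_SPLIT_DELIM_CAPTURE from returning the delimiters as extra pieces.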
http://regex.larsolavtorvik.com/
EDIT: I tweaked the regexp a bit so it’s faster. Tested it, and it works 🙂
Now let’s explain all of that:
The pattern in detail:
Why do we use [^ø] instead of . ? The idea is that . is more expensive, because the regexp engine has to consider every possible character, whereas [^ø] only checks that the character is not ø. Unlike . it also matches newlines, which . skips unless the s modifier is set. ø should practically never appear in this kind of markup, but if that worries you, you can replace it with chr(7), which is the bell character and will obviously never be typed into a web page.
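One difference that is easy to check is how the two forms treat newlines. The sketch below uses a hypothetical two-line sample and the chr(7) variant, written as \x07 inside the pattern; it illustrates matching behaviour only and says nothing about speed:

```php
<?php
// Hypothetical sample: the tag content spans two lines.
$markup = "<% line one\nline two %>";

$dot     = '/<%\s*(.*?)\s*%>/';          // . stops at a newline
$negated = '/<%\s*([^\x07]*?)\s*%>/';    // anything except the bell char
$dotall  = '/<%\s*(.*?)\s*%>/s';         // s modifier: . matches newlines

$dotCount     = preg_match($dot, $markup);
$negatedCount = preg_match($negated, $markup, $viaNegated);
$dotallCount  = preg_match($dotall, $markup, $viaDotall);

var_dump($dotCount);                          // no match without the s modifier
var_dump($viaNegated[1] === $viaDotall[1]);   // both capture the same two lines
```

So a negated class such as [^\x07] behaves like . with the s modifier, as long as the excluded character never occurs in the input.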
EDIT2: I just read your edit about capturing all the matches. In that case, use preg_match_all in the same way.
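A rough sketch of the preg_match_all version follows. The helper name all_tag_contents and the sample string are assumptions for illustration. This variant also pairs the delimiters as the question preferred, using a branch-reset group (?|...) so that both alternatives store their content in group 1:

```php
<?php
// Hypothetical helper: returns the content of every well-paired tag,
// i.e. <% ... %> or &lt;% ... %&gt;, but not a mix of the two.
function all_tag_contents(string $markup): array
{
    // (?| ... | ... ) is a branch-reset group: capture numbering restarts
    // in each alternative, so the content is always group 1.
    $pattern = '/(?|<%\s*(.*?)\s*%>|&lt;%\s*(.*?)\s*%&gt;)/s';

    preg_match_all($pattern, $markup, $m);
    return $m[1]; // list of captured contents, empty if nothing matched
}

$markup = 'a <% one %> b &lt;% two %&gt; c <% three %&gt; d';
print_r(all_tag_contents($markup));
```

In this sample the last tag opens with <% but closes with %&gt;, so it is skipped and only the two well-paired tags are returned.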