Possible Duplicate:
Regex – nested patterns – within outer pattern but exclude inner pattern
I am trying to get the substring of a string that starts at a given word, but only where that word occurs outside of parentheses. For example:
"something (theword other things) theword some more stuff" should give me "theword some more stuff" instead of "theword other things) theword some more stuff". How can I do this with regular expressions? I am using a PCRE-style engine (e.g. PHP's preg functions or Python's regex engine).
Edit:
The string I am trying to use this regular expression on is a MySQL statement. I am trying to remove everything up to the FROM part, but inner SQL statements (which are in parentheses) are causing me problems.
This is a task similar to the one described in this question: Regex – nested patterns – within outer pattern but exclude inner pattern
See my answer there for why it’s not 100% possible with regular expressions, and for a hack-ish solution that works most of the time (for shallow nesting, that is).
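If a regular expression is not a hard requirement, nesting of any depth can be handled by a plain scan that tracks the parenthesis depth. A minimal sketch in Python, not taken from the linked answer: the function name is made up for illustration, and it assumes parentheses are never escaped and never appear inside quoted strings (which real SQL may violate).

```python
def tail_from_word(text, word):
    """Return text from the first occurrence of word at parenthesis
    depth 0, or None if there is no such occurrence.

    Assumption: parentheses are never escaped or quoted.
    """
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            # Clamp at 0 so a stray ")" does not push the depth negative.
            depth = max(depth - 1, 0)
        elif depth == 0 and text.startswith(word, i):
            return text[i:]
    return None


print(tail_from_word("something (theword other things) theword some more stuff", "theword"))
# prints: theword some more stuff
```

Note that this matches the word as a plain substring, so it would also stop at "thewords"; add a boundary check if that matters.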
Update:
If you know that parentheses will be neither nested nor escaped, you could use something akin to this:
So with this as the haystack:
You would end up with these snippets in capture group 1:
Note that leading/trailing spaces are captured as well. To prevent that, use this instead:
Same regex pattern, but with comments:
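For the original question (everything from the first occurrence of the word outside parentheses), a commented pattern could look roughly like the sketch below. It is written for Python's `re` module in verbose mode and rests on the same assumption as above: parentheses are neither nested nor escaped. The helper name `tail_outside_parens` is hypothetical, and the word is assumed to consist of word characters so that `\b` behaves as expected.

```python
import re


def tail_outside_parens(haystack, word):
    """Return haystack from the first `word` outside parentheses, or None."""
    pattern = re.compile(
        rf"""
        ^                       # start of the haystack
        (?:                     # skip, as few times as possible, either:
            [^()]               #   one character that is not a parenthesis, or
          | \( [^()]* \)        #   one complete, non-nested parenthesised group
        )*?
        (                       # group 1:
            \b {re.escape(word)} \b   # the word, outside any parentheses,
            .*                  #   plus everything after it
        )
        $                       # end of the haystack
        """,
        re.VERBOSE | re.DOTALL,
    )
    match = pattern.match(haystack)
    return match.group(1) if match else None


print(tail_outside_parens("something (theword other things) theword some more stuff", "theword"))
# prints: theword some more stuff
print(tail_outside_parens("SELECT a, (SELECT x FROM t) AS b FROM u WHERE 1", "FROM"))
# prints: FROM u WHERE 1
```

Because a parenthesised group is only ever consumed as a whole, the match cannot start inside one. With nested or unbalanced parentheses the pattern can fail or give the wrong result, which is the limitation described above.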