I need to match <output_channels> elements which don’t contain the phrase ‘Story’ between the opening <output_channels> and closing </output_channels> tags. <output_channels> elements are never nested, so I think I should be able to do this with regex – please don’t reply that it’s impossible unless it genuinely is!
Here’s an example of the text I’ll be searching, using either Perl or Vim (I find it easier to test regexes in Vim):
<output_channels>
<output_channel>RSS</output_channel>
<output_channel>Story</output_channel>
</output_channels>
<output_channels>
<output_channel>RSS</output_channel>
</output_channels>
I’m thinking I need to run something like the following, but this matches both <output_channels> blocks:
<output_channels>.*?((?!Story).)*?<\/output_channels>
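You need to get rid of that first .*? in your pattern. What’s happening is, after the ((?!Story).)*? part correctly fails to match content with Story in it, the regex engine backtracks and gives the .*? a crack at it, and of course it succeeds: the .*? simply swallows enough text to get past the start of Story, and the lookahead part then matches the rest. This assumes you’re matching in /s (single-line, or dot-matches-all) mode, so that the dot can cross the line breaks between the tags.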
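To see the difference between the two patterns, here is a minimal sketch in Python rather than Perl or Vim. The sample text is the one from the question; the variable names are just illustrative, and re.DOTALL is assumed to play the role of Perl's /s modifier. The escaped slash from the original is dropped because Python has no regex delimiters.

```python
import re

# Sample text from the question: the first block contains "Story", the second does not.
text = """<output_channels>
<output_channel>RSS</output_channel>
<output_channel>Story</output_channel>
</output_channels>
<output_channels>
<output_channel>RSS</output_channel>
</output_channels>"""

# Original attempt: the leading .*? lets the engine backtrack past "Story".
original = re.compile(
    r"<output_channels>.*?((?!Story).)*?</output_channels>", re.DOTALL
)

# Fixed pattern: every character between the tags must pass the lookahead,
# so a block containing "Story" can never match.
fixed = re.compile(
    r"<output_channels>((?!Story).)*?</output_channels>", re.DOTALL
)

original_matches = [m.group(0) for m in original.finditer(text)]
fixed_matches = [m.group(0) for m in fixed.finditer(text)]

print(len(original_matches))  # both blocks match
print(len(fixed_matches))     # only the block without "Story" matches
print(fixed_matches[0])
```

In Perl the fixed pattern should carry over unchanged, apart from escaping the slash again if / is the delimiter and adding the /s modifier.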