OK, so I opened up this question yesterday and got an answer fairly quickly. It worked, or so I thought, so I marked it as the correct answer.
However, I don’t think I explained the situation very well. Basically, I get the HTML right before it is rendered, parse it, and search for strings matching the pattern [tag|text x], where x is a number and the two words are case-insensitive.
However, as stated in the previous question, I would like to NOT replace these tags if they’re inside a textarea. This means that if they’re between </textarea> and <textarea...> then I would still like to replace them, but if they’re between <textarea...> and </textarea> then I would NOT like to replace them.
So far I have
@"(?<!\<textarea class='tag'\>)\[(tag|text) ([0-9]+)\]"
I have tried
@"(?<!\<textarea.[^>]*\>)\[(tag|text) ([0-9]+)\]"
but that doesn’t appear to work either.
For example I would like to replace any tags outside of the textareas in the following:
[tag 1]
<textarea>[tag 2]</textarea>[tag 3]
<textarea class="bob">Walter [tag 4]</textarea>[tag 5]
<textarea attr-1="fred">Jim [tag 6] Mary</textarea>[tag 7]
[tag 8]
In this example only tags 1, 3, 5, 7 and 8 should be replaced; 2, 4 and 6 should not.
Does anyone have any idea what I should change it to in order to achieve this? I am not asking for anyone to just do all the work for me and give me the answer; I am in this to learn. I have struggled with this for a few hours now, so any assistance would be great!
This kind of thing is usually easier to do with lookaheads than lookbehinds. This works as you requested:
The idea here is to look for a </textarea> tag, but only if you don’t encounter a <textarea...> tag first; that’s this part:
Assuming the HTML is well formed, that regex could only match inside a textarea element. Putting it in a negative lookahead, which is executed after the [tag] has been matched, causes matches in textareas to be rejected.
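One way the approach described above might be written, sketched in Python for illustration. The exact pattern, the names `TAG_OUTSIDE_TEXTAREA` and `replace_tags`, and the `{tag:N}` replacement format are all assumptions made for this example rather than the answerer's exact code. It also assumes well-formed HTML with no nested textareas. The same pattern should carry over to a .NET verbatim string with `RegexOptions.IgnoreCase | RegexOptions.Singleline`, though that is not tested here.

```python
import re

# Match [tag N] or [text N], then reject the match if a closing </textarea>
# can be reached without first passing an opening <textarea ...>.
TAG_OUTSIDE_TEXTAREA = re.compile(
    r"\[(tag|text) ([0-9]+)\]"          # the tag itself: group 1 = word, group 2 = number
    r"(?!"                               # negative lookahead, run after the tag has matched
    r"(?:(?!<textarea\b).)*"             # any characters that do not start an opening <textarea
    r"</textarea>"                       # ...followed by a closing tag: we are inside a textarea
    r")",
    re.IGNORECASE | re.DOTALL,           # DOTALL lets '.' cross line breaks
)


def replace_tags(html, replacement=r"{\1:\2}"):
    """Replace every tag outside a textarea; the replacement format is hypothetical."""
    return TAG_OUTSIDE_TEXTAREA.sub(replacement, html)


SAMPLE = """[tag 1]
<textarea>[tag 2]</textarea>[tag 3]
<textarea class="bob">Walter [tag 4]</textarea>[tag 5]
<textarea attr-1="fred">Jim [tag 6] Mary</textarea>[tag 7]
[tag 8]"""

if __name__ == "__main__":
    # Tags 1, 3, 5, 7 and 8 are rewritten; 2, 4 and 6 are left alone.
    print(replace_tags(SAMPLE))
```

The inner `(?!<textarea\b)` check is what stops the scan at the next opening tag, so a tag that sits before a later textarea is not mistaken for one inside it. Because the lookahead scans forward from every candidate tag, this may be slow on very large pages; an HTML parser is likely the more robust choice if the markup cannot be trusted to be well formed.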