I need to find and replace all matches of a piece of text case-insensitively, unless the text is inside an anchor tag – for example:
<p>Match this text and replace it</p>
<p>Don't <a href="/">match this text</a></p>
<p>We still need to match this text and replace it</p>
Searching for ‘match this text’ should replace only the first and last instances.
[Edit] As per Gordon’s comment, it may be preferable to use DOMDocument in this instance. I’m not at all familiar with the DOMDocument extension, and would really appreciate some basic examples of this functionality.
Here is a UTF-8-safe solution that works not only with properly formatted documents, but also with document fragments.
The mb_convert_encoding() call is needed because loadHTML() seems to have a bug with UTF-8 encoding (see references 3 and 4 below).
The mb_substr() call trims the body tag from the output, so you get back your original content without any additional markup.
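The listing itself is not reproduced here, so what follows is only a minimal sketch of the approach described above, not the original code. The function name `replaceOutsideLinks`, the XPath expression, the use of `preg_replace_callback()` for the case-insensitive match, and the choice to insert the replacement as plain text are all assumptions made for illustration. Note also that the `'HTML-ENTITIES'` target of `mb_convert_encoding()` is deprecated as of PHP 8.2, so newer versions will emit a deprecation notice for that line.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: replace $search (case-insensitively) with $replacement
 * in every text node of $html that is not inside an <a> element.
 */
function replaceOutsideLinks(string $html, string $search, string $replacement): string
{
    // loadHTML() rejects empty input, and an empty search has nothing to match.
    if (trim($html) === '' || $search === '') {
        return $html;
    }

    // Collect parser warnings for fragments instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // Turn non-ASCII characters into entities so loadHTML() does not
    // misread the UTF-8 bytes. ('HTML-ENTITIES' is deprecated since PHP 8.2.)
    $dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8'));

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath   = new DOMXPath($dom);
    // /i = case-insensitive, /u = treat pattern and subject as UTF-8.
    $pattern = '/' . preg_quote($search, '/') . '/iu';

    // Select only text nodes that have no <a> ancestor.
    foreach ($xpath->query('//text()[not(ancestor::a)]') as $node) {
        $replaced = preg_replace_callback(
            $pattern,
            static fn (array $m): string => $replacement,
            $node->nodeValue
        );
        if ($replaced !== null) {
            // Assumption: the replacement is plain text, not markup.
            $node->nodeValue = $replaced;
        }
    }

    $body = $xpath->query('//body')->item(0);
    if ($body === null) {
        return '';
    }

    // Strip the leading "<body>" (6 chars) and trailing "</body>" (7 chars).
    return mb_substr($dom->saveXML($body), 6, -7, 'UTF-8');
}

$html = '<p>Match this text and replace it</p>'
      . '<p>Don\'t <a href="/">match this text</a></p>'
      . '<p>We still need to match this text and replace it</p>';

echo replaceOutsideLinks($html, 'match this text', 'REPLACED'), "\n";
// <p>REPLACED and replace it</p><p>Don't <a href="/">match this text</a></p><p>We still need to REPLACED and replace it</p>
```

Because the body is serialized with saveXML(), void elements such as `<br>` come back in self-closed form (`<br/>`). If the replacement has to contain markup (for example a hyperlink), the text node would need to be swapped for a document fragment instead of assigning to `nodeValue`.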
References:
1. find and replace keywords by hyperlinks in an html fragment, via php dom
2. Regex / DOMDocument – match and replace text not in a link
3. php problem with russian language
4. Why Does DOM Change Encoding?
I read dozens of answers on the subject, so I am sorry if I forgot somebody (please leave a comment and I will add yours as well).
Thanks to Gordon and stillstanding for commenting on my other answer.