I have been trying to work out a regex that replaces every < and > in a string with &lt; and &gt;, EXCEPT when those characters are part of an HTML tag.
For example:
var str = "<p>The <b>value</b> <i>1</i> is < <u>2</u></p>"
Given the above example, I want a resultant string that looks like this:
var str = "<p>The <b>value</b> <i>1</i> is &lt; <u>2</u></p>"
This is not easy. See the authoritative answer to a related question here.
Regular expressions are not built for this type of parsing. Even tokenizing or DOM parsing can cause problems. The title of your question illustrates the problem:
"Replace all < and > that are NOT part of an HTML tag"
How can your parser know if < and > is an <AND> tag, or simply two orphan angle brackets around the word "and"? An HTML parser is probably your best bet, but how the orphan brackets are handled is key. Also, you would need to look for unmatched tags or illegal tags to catch cases such as the title of your question.
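To make the ambiguity concrete, here is a rough sketch of one heuristic, not a real parser: treat a bracket pair as a tag only if it is shaped like one and its name is on an allowlist, and escape everything else. The names KNOWN_TAGS, TAG_LIKE, escapeBrackets and escapeOrphanBrackets are made up for this example, and the allowlist is an assumption you would have to tune for your own markup.

```javascript
// Hypothetical allowlist: only these names count as real tags in this sketch.
const KNOWN_TAGS = new Set(["p", "b", "i", "u", "em", "strong", "a", "span", "div", "br"]);

// Something shaped like a tag: <name>, <name attrs>, </name> or <name/>.
// Known gap: an attribute value that itself contains "<" or ">" will not match.
const TAG_LIKE = /<\/?([A-Za-z][A-Za-z0-9]*)(?:\s[^<>]*)?\/?>/g;

// Escape every angle bracket in a piece of text.
function escapeBrackets(text) {
  return text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Keep allowlisted tags as they are and escape all other angle brackets.
function escapeOrphanBrackets(html) {
  let result = "";
  let lastIndex = 0;
  for (const match of html.matchAll(TAG_LIKE)) {
    // Text between tag candidates is never a tag, so escape it.
    result += escapeBrackets(html.slice(lastIndex, match.index));
    // A candidate survives only if its name is on the allowlist.
    const isKnown = KNOWN_TAGS.has(match[1].toLowerCase());
    result += isKnown ? match[0] : escapeBrackets(match[0]);
    lastIndex = match.index + match[0].length;
  }
  // Escape whatever follows the last candidate.
  return result + escapeBrackets(html.slice(lastIndex));
}

console.log(escapeOrphanBrackets("<p>The <b>value</b> <i>1</i> is < <u>2</u></p>"));
console.log(escapeOrphanBrackets("Replace all < and > that are NOT part of an HTML tag"));
console.log(escapeOrphanBrackets("Is <AND> a tag?"));
```

With this allowlist, the string from the question keeps its tags and only the lone < is escaped, while both < and > and <AND> are escaped because "and" is not a known tag name. It still guesses wrong on text such as a <b and c> d, which looks exactly like a b tag with attributes, so it only narrows the problem described above rather than solving it.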