Nearly all browsers allow a certain amount of leeway when rendering invalid HTML. For example, they render x < y as if it had been written x &lt; y, because it is “clear” that the < is intended as a literal character and not as the start of an HTML tag.
Where can I find that logic as a separate “cleanup” module? Such a module would convert x < y to x &lt; y.
Try looking at the source code for Tidy.
HTML before running through Tidy:
Same HTML after running through Tidy:
Notice that x < y was changed to x &lt; y.
UPDATE
Based on your comment, you should probably use Tidy to clean up your HTML. I believe there are Tidy libraries for most of the common languages. If you are using PHP, there is PHP Tidy.
UPDATE
I noticed that you said you’re using C#. You can use Tidy with C# as well. Here’s something I found. I don’t develop in C# and I haven’t tried this out, so YMMV:
Fix Up Your HTML with HTML Tidy and .NET
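If all you need is the specific x < y fix from the question, the idea can be sketched in a few lines. This is only an illustration in Python using the standard library. It is not Tidy's algorithm and not what browsers actually do, and the function name escape_stray_markup and both patterns are my own assumptions. The heuristic: a < counts as markup only when it is immediately followed by a letter, /, ! or ?, and an & counts as a character reference only when it is followed by a name or number and a semicolon. Everything else gets escaped.

```python
import re

# Assumption: "<" starts real markup only when followed by a letter (tag),
# "/" (closing tag), "!" (comment/doctype) or "?" (processing instruction).
_STRAY_LT = re.compile(r"<(?![A-Za-z/!?])")

# Assumption: "&" starts a character reference only when followed by
# "name;", "#digits;" or "#xhex;".
_STRAY_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def escape_stray_markup(html: str) -> str:
    """Escape '<' and '&' characters that do not look like markup.

    Hypothetical helper for illustration; it does not repair unclosed
    tags, bad nesting, or anything else a real cleanup tool handles.
    """
    html = _STRAY_AMP.sub("&amp;", html)   # fix bare ampersands first
    html = _STRAY_LT.sub("&lt;", html)     # then bare less-than signs
    return html


if __name__ == "__main__":
    print(escape_stray_markup("<p>x < y</p>"))
```

Note the limitation: x<y without spaces looks like the start of a tag named y to this heuristic and is left alone. For anything beyond this one case, a real library such as Tidy is the safer choice.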