I need to extract text from an HTML file using C#.
I am trying to use HtmlAgilityPack, but I am getting parse errors (unclosed tags).
I am using these two options:
htmlDoc.OptionFixNestedTags = true;
htmlDoc.OptionAutoCloseOnEnd = true;
Is there any "fix all" type of option? I don't care about the errors; I just want the content, or something close to it.
This may be more of a workaround, but the one time I had to extract text from HTML, I used a regex:
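Something along these lines, as a rough sketch rather than the exact pattern I used back then. The class name `HtmlTextSketch`, the method name `StripTags`, and the patterns themselves are illustrative assumptions. It relies only on `System.Text.RegularExpressions` and `System.Net.WebUtility`, so unclosed tags do not matter: nothing is parsed into a tree.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper: a regex-based tag stripper, not a real HTML parser.
public static class HtmlTextSketch
{
    // <script> and <style> hold code, not readable text: drop them with their contents.
    private static readonly Regex ScriptOrStyle = new Regex(
        @"<(script|style)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // HTML comments, possibly spanning several lines.
    private static readonly Regex Comment = new Regex(
        @"<!--.*?-->", RegexOptions.Singleline);

    // Any remaining tag, opening or closing, balanced or not.
    private static readonly Regex Tag = new Regex(@"<[^>]+>");

    // Runs of whitespace left behind by the removals.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    public static string StripTags(string html)
    {
        string text = ScriptOrStyle.Replace(html, " ");
        text = Comment.Replace(text, " ");
        text = Tag.Replace(text, " ");

        // Turn entities such as &amp; back into plain characters.
        text = WebUtility.HtmlDecode(text);

        // Collapse whitespace so the result reads as one line of text.
        return Whitespace.Replace(text, " ").Trim();
    }
}
```

Usage would be `string text = HtmlTextSketch.StripTags(File.ReadAllText(path));` (with `using System.IO;`). Each tag is replaced by a space, so text split by inline tags inside a single word will come out with a gap, and a literal `<` inside an attribute value or in plain text can confuse the tag pattern. For "I just want the content or close" that trade-off is usually acceptable.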