I am using HtmlAgilityPack to parse my HTML document, but the parsed HTML comes out wrong.
For example:
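// A verbatim string (@"...") is needed for a literal that spans several lines.
string s = @"<!DOCTYPE html>
<li>Voltage: <0.05% + 10 mV
(<0.1% + 25 mV for output 2 of E3646/47/48/49A)</li>
</html>";
// The document must be instantiated before LoadHtml is called.
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(s);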
But I get:
"<li>Voltage: <0.05% +="" 10="" mv=""></0.05%><0.1% +="" 25="" mv="" for="" output="" 2="" of=""></0.1%></li>"
instead of:
"<li>Voltage: <0.05% + 10 mV (<0.1% + 25 mV for output 2 of E3646/47/48/49A)</li>"
What is the problem?
P.S. I have another HTML document with UTF-8 encoding, and it does not have this problem.
You have < in the text of the li, causing mV etc. to be interpreted as attributes of the 0.05% element (it is parsed as an element because a < precedes it). You should escape these as &lt;.
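As a minimal sketch of that fix, assuming the HtmlAgilityPack NuGet package is referenced, the same markup can be loaded with each literal < in the text written as &lt;. The Program class and the null check are only illustrative scaffolding, not part of the original question.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Same markup as in the question, but every literal '<' inside
        // the text of the li is escaped as &lt; so it is not read as a tag.
        string s = @"<!DOCTYPE html>
<li>Voltage: &lt;0.05% + 10 mV
(&lt;0.1% + 25 mV for output 2 of E3646/47/48/49A)</li>
</html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(s);

        HtmlNode li = doc.DocumentNode.SelectSingleNode("//li");
        if (li == null)
        {
            Console.WriteLine("No li element found.");
            return;
        }

        // OuterHtml shows the element as parsed, entities included.
        Console.WriteLine(li.OuterHtml);

        // DeEntitize converts &lt; back to '<' when plain text is wanted.
        Console.WriteLine(HtmlEntity.DeEntitize(li.InnerText));
    }
}
```

If the text is produced by code rather than written by hand, System.Net.WebUtility.HtmlEncode can do the escaping before the text is placed inside the markup.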