I would like to extract some specific values from an (X)HTML file using an XPath expression.
I read the HTML file as XML using the XmlDocument class.
The problem is that my XPath query doesn't work because of the namespace declared on the html tag:
<html xmlns="http://www.w3.org/1999/xhtml">
If I remove the xmlns attribute from the html tag, it works fine.
What's wrong?
(I don't want to use the HTML Agility Pack.)
Thanks!
Here’s my code:
XmlDocument readDoc = new XmlDocument();
System.Xml.XmlNamespaceManager xmlnsManager = new System.Xml.XmlNamespaceManager(readDoc.NameTable);
readDoc.XmlResolver = null;
xmlnsManager.AddNamespace("html", "http://www.w3.org/1999/xhtml");
readDoc.Load("myHTML.html");
int count = readDoc.SelectNodes("//html/body/div/span[@class='layout']",xmlnsManager).Count;
Since your elements are in the XHTML namespace, your XPath expression must qualify them with the prefix you registered (html). This applies to every step of the path, sub-elements included, not just the root element…
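
As a minimal sketch of what that looks like, the snippet below registers the html prefix and uses it on every element step of the path. The class name XhtmlXPathSketch, the method CountLayoutSpans, and the inline sample markup are made up for illustration; the question's code loads "myHTML.html" from disk instead. The class attribute stays unprefixed because unprefixed attributes are not in any namespace.

```csharp
using System;
using System.Xml;

public static class XhtmlXPathSketch
{
    // Hypothetical helper: counts <span class="layout"> elements under html/body/div.
    public static int CountLayoutSpans(string xhtml)
    {
        var doc = new XmlDocument();
        doc.XmlResolver = null;   // same setting as in the question
        doc.LoadXml(xhtml);       // the question uses doc.Load("myHTML.html")

        // Map a prefix of our choosing to the XHTML namespace URI.
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("html", "http://www.w3.org/1999/xhtml");

        // Every element step carries the prefix; the attribute test does not.
        XmlNodeList spans = doc.SelectNodes(
            "//html:html/html:body/html:div/html:span[@class='layout']", ns);
        return spans.Count;
    }

    public static void Main()
    {
        // Illustrative sample document, not taken from the question.
        string sample =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><div>" +
            "<span class=\"layout\">a</span><span>b</span>" +
            "</div></body></html>";

        Console.WriteLine(CountLayoutSpans(sample)); // prints 1
    }
}
```

The prefix in the XPath does not have to match anything in the document. It only has to be mapped to the same namespace URI in the XmlNamespaceManager, which is why the document can use a default namespace with no prefix at all while the query uses html.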