I am trying to parse the table below, but each node seems to end up nested inside the previous one. 🙁 I cannot walk the child nodes because ChildNodes.Count is always 1.
Strangely, the parser treats, for example, the next “tr” as a child node of the previous “tr”.
Does anyone have an idea why?
<table width="292px" border="0">
<tr>
<td>
</td>
</tr>
<tr>
<td>
<table>
<tr>
<td colspan="2" bgcolor="#FBCE9D" align="center" height="40">
</td>
</tr>
<tr>
<td bgcolor="#FFF4D2" height="25" width="60">
</td>
<td height="25" bgcolor="#e8e8e8">
</td>
</tr>
<tr>
<td bgcolor="#FFF4D2" height="25" width="60">
</td>
<td height="25" bgcolor="#e8e8e8">
</td>
</tr>
<tr>
<td bgcolor="#FFF4D2" height="25" width="60">
</td>
<td height="25" bgcolor="#e8e8e8">
</td>
</tr>
<tr>
<td bgcolor="#FFF4D2" height="25" width="60">
</td>
<td height="25" bgcolor="#e8e8e8">
</td> //Here is a missing "</tr>" and I think this one is confusing the agilitypack!
<tr>
<td bgcolor="#FFF4D2" height="35" colspan="2" align="center">
</td>
</tr>
</table>
</td>
</tr>
</table>
My code is:
var webGet = new HtmlWeb();
var doc = webGet.Load("the url where this table is located");
HtmlNodeCollection tb = doc.DocumentNode.SelectNodes("//table[@width='292px']");
var table = tb[0].ChildNodes[1].ChildNodes[0].ChildNodes[0].ChildNodes;
for (var na = 0; na < table.Count; na++)
{ .....do the work.... }
This code used to work like a charm, but then they nested another table inside the page, and now it gets stuck at ChildNodes[1]: there is never a ChildNodes[1], only ChildNodes[0].
One more note: Firebug shows “/html/body/table/tbody/tr[2]/td/table/tbody” as the XPath of the nested table, but as you may notice, Html Agility Pack knows nothing about “tbody”, because the browser inserts it dynamically while it builds the DOM and repairs the markup, including the missing closing tag /tr.
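For anyone hitting the same thing, here is a minimal sketch of a less index-dependent way to reach the rows. It assumes the nested table is the only table inside the outer one, and it uses a reduced inline copy of the markup (with placeholder cell text such as "label 1") instead of the real URL. The XPath has no “tbody” step because that element exists only in the browser's DOM, not in the page source. How the row after the missing closing tag is nested may still differ between Html Agility Pack builds, so treat the output as something to check rather than a guarantee.

```csharp
using System;
using HtmlAgilityPack;

class NestedTableRows
{
    static void Main()
    {
        // Reduced copy of the markup from the question, including the missing </tr>.
        // The cell text is made up for illustration.
        const string html = @"
<table width='292px' border='0'>
  <tr><td></td></tr>
  <tr><td>
    <table>
      <tr><td>label 1</td><td>value 1</td>
      <tr><td colspan='2'>footer</td></tr>
    </table>
  </td></tr>
</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Locate the outer table by its width attribute, as in the original code.
        HtmlNode outer = doc.DocumentNode.SelectSingleNode("//table[@width='292px']");
        if (outer == null)
        {
            Console.WriteLine("Outer table not found.");
            return;
        }

        // ".//table//tr" matches every <tr> under the nested table at any depth,
        // so a row that the parser nested inside the previous row is still found.
        HtmlNodeCollection rows = outer.SelectNodes(".//table//tr");
        if (rows == null) // SelectNodes returns null when nothing matches
        {
            Console.WriteLine("No rows found in the nested table.");
            return;
        }

        foreach (HtmlNode row in rows)
        {
            // Only the <td> elements that are direct children of this row.
            foreach (HtmlNode cell in row.Elements("td"))
            {
                Console.WriteLine(cell.InnerText.Trim());
            }
        }
    }
}
```

Selecting by element name with a descendant axis avoids counting ChildNodes positions, which also shift whenever whitespace text nodes appear between tags.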
Interestingly, the problem turned out to be the Html Agility Pack build that is available on NuGet! I removed it and downloaded the library from the web (http://htmlagilitypack.codeplex.com/) instead. It is working now!