I’m trying to scrape an HTML table full of data on a website. Unfortunately, the source code for the table looks like this:
<table border="1" cellspacing="0" cellpadding="3">
<tr>
<td bgcolor="silver"><font face="arial,helvetica" size="1">Last Name</font></td>
<td bgcolor="silver"><font face="arial,helvetica" size="1">First Name</font></td>
<td bgcolor="silver"><font face="arial,helvetica" size="1">Middle</font></td>
</tr>
<td valign="top"><font face="arial,helvetica" size="1">
Data</font></td>
<td valign="top"><font face="arial,helvetica" size="1">
Data</font></td>
<td valign="top"><font face="arial,helvetica" size="1">
Data</font></td>
</tr>
<td valign="top"><font face="arial,helvetica" size="1">
More Data</font></td>
<td valign="top"><font face="arial,helvetica" size="1">
More Data</font></td>
<td valign="top"><font face="arial,helvetica" size="1">
More Data</font></td>
</tr>
</table>
Note the lack of starting “tr” tags for each row after the header. The table shows up fine in a browser, but the Html Agility Pack will not recognize the tr elements that have no start tag. Is there any way I can get the Html Agility Pack to fix this issue? I'd rather not insert the tr tags myself, but will if I have to.
You can try parsing the td elements directly and grouping them in sets of three, one set per row.
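The grouping idea can be sketched independently of the Html Agility Pack. The example below is only an illustration in Python using the standard library's html.parser, not the Html Agility Pack API. The names CellCollector and rows_from_cells are hypothetical, and the sketch assumes every row has exactly three cells and that the table contains no nested tables.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of every <td> and ignores <tr> tags entirely."""

    def __init__(self):
        super().__init__()
        self.cells = []      # text of each completed cell, in document order
        self._buffer = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Stray </tr> tags arrive here too and are simply skipped.
        if tag == "td" and self._buffer is not None:
            self.cells.append("".join(self._buffer).strip())
            self._buffer = None


def rows_from_cells(html, columns=3):
    """Group the flat list of cell texts into rows of `columns` items.

    Assumption: every row has exactly `columns` cells.
    """
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    cells = parser.cells
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


# Shortened copy of the table from the question, missing <tr> tags included.
sample = """
<table border="1">
<tr>
<td bgcolor="silver"><font size="1">Last Name</font></td>
<td bgcolor="silver"><font size="1">First Name</font></td>
<td bgcolor="silver"><font size="1">Middle</font></td>
</tr>
<td valign="top"><font size="1">
Data</font></td>
<td valign="top"><font size="1">
Data</font></td>
<td valign="top"><font size="1">
Data</font></td>
</tr>
</table>
"""

rows = rows_from_cells(sample)
for row in rows:
    print(row)
```

Because the cells are read as a flat sequence, the missing tr start tags no longer matter. The first group is the header row and each later group is one data row. If the column count ever changes, the columns argument has to change with it.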