I’m trying to parse an HTML file with libxml2. Usually this works fine, but not in this case:
<p>
<b>Titles</b>
(Some Text)
<table>
<tr>
<td valign="top">
…Something1...
</td>
<td align="right" valign="top">
…Something2...
</td>
</tr>
</table>
</p>
I run this query to get the first <td>:
//p[b='Titles']/table/tr/td[0]
but nothing is returned, because libxml treats the <table> tag not as a child of the <p> tag but as an element that follows it.
So, finally, the question: why?
Are you using the HTML or the XML parser? AFAIR, HTML allows only inline elements inside <p> (you cannot put <table> in <p>), so the parser auto-closes the <p> tag as soon as it sees the <table> tag (in HTML, you don’t have to close every tag). So your HTML is roughly equivalent to a <p> that ends right before <table>, with the <table> following it as a sibling rather than nested inside it.
Try using the XML parser from libxml instead of the HTML one.
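To make the difference concrete, here is a minimal sketch in Python. It uses the standard-library xml.etree.ElementTree as a stand-in for a strict XML parser; it is not libxml2 and does not reproduce its HTML parser. The <body> wrapper, the first_cell helper, and the AUTO_CLOSED string (a hand-written guess at the tree an HTML parser would roughly build) are assumptions made for illustration. Note also that XPath positions start at 1, so td[0] can never match; the sketch uses td[1] for the first cell.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <body> root so that
# <p> can be found as a descendant. Cell contents are placeholders.
AS_WRITTEN = """<body><p>
<b>Titles</b>
(Some Text)
<table>
<tr>
<td valign="top">...Something1...</td>
<td align="right" valign="top">...Something2...</td>
</tr>
</table>
</p></body>"""

# Hand-written approximation (an assumption, not real parser output) of the
# tree after <p> has been auto-closed in front of <table>.
AUTO_CLOSED = """<body><p>
<b>Titles</b>
(Some Text)
</p>
<table>
<tr>
<td>...Something1...</td>
<td>...Something2...</td>
</tr>
</table></body>"""

# Same query as in the question, but with a 1-based position.
QUERY = ".//p[b='Titles']/table/tr/td[1]"


def first_cell(markup, path=QUERY):
    """Return the stripped text of the first match, or None if nothing matches."""
    root = ET.fromstring(markup)
    cell = root.find(path)
    return None if cell is None else cell.text.strip()


if __name__ == "__main__":
    print(first_cell(AS_WRITTEN))                        # ...Something1...
    print(first_cell(AUTO_CLOSED))                       # None
    print(first_cell(AUTO_CLOSED, ".//table/tr/td[1]"))  # ...Something1...
```

With the strict parse, <table> stays inside <p> and the query matches. Once <table> is a sibling of <p>, the same query finds nothing, and only a path that does not go through <p> reaches the cell.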