I have the following HTML table element:
<table class='myTable'>
<tbody>
<tr>
<th>header1</th>
<td>data1</td>
</tr>
<tr>
<th>header2</th>
<td><table><tbody><tr><th>subheader1</th><td>subdata1</td></tr>
<tr><th>subheader2</th><td>subdata2</td></tr>
</tbody></table></td>
</tr>
<tr>
<th>header3</th>
<td>data3</td>
</tr>
....
</tbody>
</table>
How can I select the headers in the table whose adjacent td element does not contain a table? In the example above, only header1 and header3 should be selected.
What I have at the moment is
Elements elements = doc.select("table[class=" + myTable + "]");
Element table;
if(elements.size()>0){
table = elements.get(0);
}
else{
return someMyObj;
}
Iterator<Element> ite = table.select("th AND SOME CONDITIONS").iterator();
while(ite.hasNext()){
Element header = ite.next();
}
Try this
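A minimal sketch of how this could look, assuming jsoup is on the classpath. The selector `> tbody > tr:not(:has(table)) > th` is reconstructed from the description that follows and is an assumption, as are the class name `HeaderSelector` and the helper `simpleHeaders`, which exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Hypothetical helper class, not part of jsoup.
class HeaderSelector {

    // Collects the text of each top-level th whose row holds no nested table.
    static List<String> simpleHeaders(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();

        Elements tables = doc.select("table.myTable");
        if (tables.isEmpty()) {
            return result; // no matching table in the document
        }
        Element table = tables.first();

        // The leading ">" anchors the query at the context element (the outer
        // table), so rows that belong to a nested table are never matched.
        for (Element header : table.select("> tbody > tr:not(:has(table)) > th")) {
            result.add(header.text());
        }
        return result;
    }

    public static void main(String[] args) {
        String html =
            "<table class='myTable'><tbody>"
          + "<tr><th>header1</th><td>data1</td></tr>"
          + "<tr><th>header2</th><td><table><tbody>"
          + "<tr><th>subheader1</th><td>subdata1</td></tr>"
          + "<tr><th>subheader2</th><td>subdata2</td></tr>"
          + "</tbody></table></td></tr>"
          + "<tr><th>header3</th><td>data3</td></tr>"
          + "</tbody></table>";
        System.out.println(simpleHeaders(html));
    }
}
```

For the table in the question, this should pick up header1 and header3 and skip header2 along with both subheaders.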
The selector matches every th that is a child of a tr which does not contain a table, where that tr is in turn a child of the tbody of the context element.
BTW, I changed your while loop to a for loop, but the idea stays the same.