I am using ElementTree and cannot figure out whether a child node is a text node or not. childelement.text does not seem to work, as it gives false positives even on nodes that are not text nodes.
Any suggestions?
Example
<tr>
<td><a href="sdas3">something for link</a></td>
<td>tttttk</td>
<td><a href="tyty">tyt for link</a></td>
</tr>
After parsing this xml file, I do this in Python:
for elem_main in container_trs:  # elem_main is each tr
    elem0 = elem_main.getchildren()[0]  # td[0]
    elem1 = elem_main.getchildren()[1]  # td[1]
    print elem0.text
    print elem1.text
The above code does not output elem0.text; it is blank. I do see the elem1.text (that is, tttttk) in the output.
Update 2
I am actually building a dictionary. The text from the element with each so that I can sort the HTML table. How would I get the s in this code?
How about using the getiterator method to iterate through all the descendant nodes?
The loop
    for elem_main in container_trs:
iterates through the children of container_trs. In contrast, the loop
    for elem_main in container_trs.getiterator():
iterates through container_trs itself, its children, its grandchildren, etc.
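To make the difference concrete, here is a minimal sketch using the sample markup from the question. It assumes Python 3 and the standard-library xml.etree.ElementTree, where getiterator() was deprecated and then removed in Python 3.9, so iter() is used in its place. The variable names (xml_source, child_tags, all_tags, cell_texts) are only illustrative, and itertext() is one possible way to collect the text of a td whose content sits inside a nested a element, not necessarily what the original poster ended up using.

```python
import xml.etree.ElementTree as ET

# Sample markup copied from the question.
xml_source = """
<tr>
<td><a href="sdas3">something for link</a></td>
<td>tttttk</td>
<td><a href="tyty">tyt for link</a></td>
</tr>
"""

tr = ET.fromstring(xml_source)

# Iterating over an element directly visits only its direct children.
child_tags = [child.tag for child in tr]

# iter() visits the element itself and every descendant, in document order.
all_tags = [elem.tag for elem in tr.iter()]

# itertext() yields the text of an element and of everything nested in it,
# so a <td> whose text lives inside an <a> is no longer blank.
cell_texts = ["".join(td.itertext()).strip() for td in tr.findall("td")]

print(child_tags)
print(all_tags)
print(cell_texts)
```

The first td has no text of its own because its text belongs to the nested a element, which is why elem0.text appeared blank in the question while elem1.text did not.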