My HTML looks like:
<td>
<table ..>
<tr>
<th ..>price</th>
<th>$99.99</th>
</tr>
</table>
</td>
I am in the current table cell; how would I get the 99.99 value?
I have so far:
td[3].findChild('th')
But I need to do:
Find th with text ‘price’, then get next th tag’s string value.
Think about it in “steps”. Given that some x is the root of the subtree you’re considering, start with the list of all items in that subtree containing the text 'price'. Next take the parents of those items, and keep only those whose “name” (tag) is 'th'. Finally, you want the “next siblings” of those (but only if they’re also 'th's).

Written as a single list comprehension, this shows the problem with that approach: too much repetition, since we can’t assign intermediate results to simple names. Let’s therefore switch to a good old loop.
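The steps above can be sketched roughly as follows. This is an illustration, not the code originally posted: it assumes the bs4 package (Beautiful Soup 4) is installed and uses its find_all / next_sibling spellings, and the names html, texts, parents, th_parents and found are made up for the example. The sample markup copies the question's table with the elided attributes left out.

```python
from bs4 import BeautifulSoup, Tag

# Sample markup modeled on the question; the attributes elided there are omitted.
html = """
<td>
<table>
<tr>
<th>price</th>
<th>$99.99</th>
</tr>
</table>
</td>
"""

x = BeautifulSoup(html, "html.parser")  # x: root of the subtree being searched

# Step 1: every text node that is exactly 'price'.
texts = x.find_all(string="price")

# Step 2: the tags that directly contain those text nodes.
parents = [t.parent for t in texts]

# Step 3: keep only the parents that are <th> tags.
th_parents = [p for p in parents if p.name == "th"]

# Remaining steps as a loop, so each intermediate result gets a simple name.
found = []
for t in x.find_all(string="price"):
    p = t.parent
    if p.name != "th":
        continue
    ns = p.next_sibling
    # Tolerate one text node (such as a newline) between the two cells.
    if ns is not None and not isinstance(ns, Tag):
        ns = ns.next_sibling
    # Accept either a <th> or a <td> as the value cell.
    if not isinstance(ns, Tag) or ns.name not in ("td", "th"):
        continue
    found.append(ns.string)  # .string is None when the cell has more than one child

print(found)
```

With the sample markup, found ends up holding the single string $99.99; stripping the dollar sign and converting to a number is left to the caller.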
Edit: added tolerance for a string of text between the parent th and the “next sibling”, as well as tolerance for the latter being a td instead, per OP’s comment.

I’ve added ns.string, which gives the next sibling’s contents if and only if they’re just text (no further nested tags). Of course you can instead analyze further at this point, depending on your application’s needs!-) Similarly, I imagine you won’t be doing just print but something smarter, but I’m giving you the structure.

Talking about the structure, notice that twice I use if...: continue. This reduces nesting compared to the alternative of inverting the if’s condition and indenting all the following statements in the loop, and “flat is better than nested” is one of the koans in the Zen of Python (import this at an interactive prompt to see them all and meditate ;-).
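To make the nesting point concrete, here is a small side-by-side sketch. Both helpers are hypothetical (next_cells_flat and next_cells_nested are names invented here), they assume bs4 is installed, and they return a list instead of printing. They should behave identically; the only difference is how deep the indentation goes.

```python
from bs4 import BeautifulSoup, Tag


def next_cells_flat(root, label):
    """Early-continue style: each guard exits the iteration immediately."""
    out = []
    for t in root.find_all(string=label):
        p = t.parent
        if p.name != "th":
            continue
        ns = p.next_sibling
        if ns is not None and not isinstance(ns, Tag):
            ns = ns.next_sibling  # skip one text node between the cells
        if not isinstance(ns, Tag) or ns.name not in ("td", "th"):
            continue
        out.append(ns.string)
    return out


def next_cells_nested(root, label):
    """Same logic with the conditions inverted and the body indented."""
    out = []
    for t in root.find_all(string=label):
        p = t.parent
        if p.name == "th":
            ns = p.next_sibling
            if ns is not None and not isinstance(ns, Tag):
                ns = ns.next_sibling
            if isinstance(ns, Tag) and ns.name in ("td", "th"):
                out.append(ns.string)
    return out


# Made-up sample: the second row has 'price' in a <td>, so it is skipped.
sample = BeautifulSoup(
    "<table><tr><th>price</th> <td>$99.99</td></tr>"
    "<tr><td>price</td><td>n/a</td></tr></table>",
    "html.parser",
)
flat_result = next_cells_flat(sample, "price")
nested_result = next_cells_nested(sample, "price")
```

With only two guards the difference is modest, but every additional condition pushes the nested version one level further to the right while the flat version stays at the same depth.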