I have the following HTML:
<td class="section">
<div style="margin-top:2px; margin-bottom:-10px; ">
<span class="username"><a href="user.php?id=xx">xxUsername</a></span>
</div>
<br>
<span class="comment">
A test comment
</span>
</td>
All I want is to retrieve xxUsername and the comment text inside the SPAN tag. So far I have done this:
results = soup.findAll("td", {"class" : "section"})
It does fetch ALL the HTML blocks matching the pattern I mentioned above. Now I want to retrieve all the child values within a single loop. Is that possible? If not, how do I fetch the child nodes' information?
You could try something like this. It basically does what you did above: it first iterates through all section-classed td's and then iterates through all span text within. This prints out the class, just in case you need to be more restrictive:
Or with a more-convoluted-than-necessary one-liner that will store everything back in your list:
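A minimal sketch of the two variants described above (the nested loop and the one-liner), assuming the `bs4` package with Python's built-in `html.parser`. `find_all` is the bs4 spelling of `findAll`, and the names `html`, `section`, `span` and `results` are illustrative only:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<td class="section">
<div style="margin-top:2px; margin-bottom:-10px; ">
<span class="username"><a href="user.php?id=xx">xxUsername</a></span>
</div>
<br>
<span class="comment">
A test comment
</span>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# Variant 1: nested loop that prints each span's class and its text.
for section in soup.find_all("td", {"class": "section"}):
    for span in section.find_all("span"):
        # bs4 returns the class attribute as a list, e.g. ['username'].
        print(span.get("class"), span.get_text(strip=True))

# Variant 2: the same traversal as a single list comprehension,
# storing (class list, text) pairs back into `results`.
results = [
    (span.get("class"), span.get_text(strip=True))
    for section in soup.find_all("td", {"class": "section"})
    for span in section.find_all("span")
]
```

`get_text(strip=True)` trims the newlines around the comment text; drop the argument if you need the whitespace preserved.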
Or on that same theme, a dictionary with the keys being a tuple of the classes and the values being the text itself:
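A sketch of that dictionary variant, under the same assumptions (`bs4` with `html.parser`; the name `by_class` is hypothetical). The class list is converted to a tuple because lists cannot be used as dictionary keys:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<td class="section">
<div style="margin-top:2px; margin-bottom:-10px; ">
<span class="username"><a href="user.php?id=xx">xxUsername</a></span>
</div>
<br>
<span class="comment">
A test comment
</span>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each span's classes (as a tuple) to its stripped text.
# Spans without a class attribute fall back to an empty tuple.
by_class = {
    tuple(span.get("class", [])): span.get_text(strip=True)
    for section in soup.find_all("td", {"class": "section"})
    for span in section.find_all("span")
}
```

Note that with more than one `td.section` block on the page, later spans overwrite earlier ones that share the same classes, so this form only suits a single block.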
Assuming this one is a bit closer to what you want, I would suggest rewriting it as:
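One possible rewrite, offered as a sketch and not necessarily the form originally suggested: build one record per section, looking up the two spans by class so each username stays paired with its comment. It assumes `bs4` with `html.parser`; the `records` list and its `username`/`comment` keys are illustrative names:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<td class="section">
<div style="margin-top:2px; margin-bottom:-10px; ">
<span class="username"><a href="user.php?id=xx">xxUsername</a></span>
</div>
<br>
<span class="comment">
A test comment
</span>
</td>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for section in soup.find_all("td", class_="section"):
    # find() returns None when a span is missing, so guard before reading text.
    username = section.find("span", class_="username")
    comment = section.find("span", class_="comment")
    records.append({
        "username": username.get_text(strip=True) if username else None,
        "comment": comment.get_text(strip=True) if comment else None,
    })

for record in records:
    print(record["username"], "-", record["comment"])
```

This keeps everything in a single loop over the sections and still works when the page contains many such blocks.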