I used page.prettify() to tidy up the HTML, and this is the text that I want to extract now:
<div class="item">
<b>
name
</b>
<br/>
stuff here
</div>
My goal is to extract the text "stuff here", but I am stumped because it is not wrapped in a tag of its own, only in that div, which also contains other content. The extra whitespace at the start of every line makes it harder as well.
What would be the way to do this?
A combination of find and nextSibling works for the example that you posted.
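Here is a minimal sketch of that idea, assuming BeautifulSoup 4 (where nextSibling is spelled next_sibling) and the built-in "html.parser" backend. The function name extract_text_after_br is just an illustrative helper, and the approach assumes the wanted text directly follows the <br/> inside the div, as in your example:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (indentation as prettify() would emit it).
html = """
<div class="item">
 <b>
  name
 </b>
 <br/>
 stuff here
</div>
"""


def extract_text_after_br(markup):
    """Return the bare text that follows the <br/> inside div.item.

    Hypothetical helper for illustration; assumes the text node is the
    immediate next sibling of the <br/> tag.
    """
    soup = BeautifulSoup(markup, "html.parser")
    br = soup.find("div", class_="item").find("br")
    # next_sibling is the text node after <br/>; strip() drops the
    # leading and trailing whitespace introduced by prettify().
    return br.next_sibling.strip()


print(extract_text_after_br(html))
```

The strip() call also takes care of the whitespace problem you mentioned. If the real pages have other tags between the <br/> and the text, the sibling lookup would need adjusting.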