I need to extract text from a node like this:
<div>
Some text <b>with tags</b> might go here.
<p>Also there are paragraphs</p>
More text can go without paragraphs<br/>
</div>
And I need to build:
Some text <b>with tags</b> might go here.
Also there are paragraphs
More text can go without paragraphs
Element.text() returns all the text content of the div, including the text of its descendants. Element.ownText() returns only the text that is not inside child elements. Neither gives me what I need. Iterating through children() skips text nodes.
Is there a way to iterate over the contents of an element so that I receive text nodes as well? E.g.:
- Text node – Some text
- Node <b> – with tags
- Text node – might go here.
- Node <p> – Also there are paragraphs
- Text node – More text can go without paragraphs
- Node <br> – <empty>
Element.children() returns an Elements object – a list of Element objects. Looking at the parent class, Node, you’ll see methods to give you access to arbitrary nodes, not just Elements, such as Node.childNodes().
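A minimal sketch of walking Node.childNodes() and telling text nodes apart from elements, assuming jsoup is on the classpath. The class name ChildNodesDemo and the helper describe are made up for illustration; the jsoup calls used are Jsoup.parseBodyFragment, Node.childNodes(), TextNode.isBlank(), TextNode.text(), Element.tagName() and Element.text().

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

import java.util.ArrayList;
import java.util.List;

public class ChildNodesDemo {

    // Hypothetical helper: parses an HTML fragment whose first element is the
    // container, then describes each direct child node, text nodes included.
    static List<String> describe(String html) {
        Element container = Jsoup.parseBodyFragment(html).body().child(0);
        List<String> out = new ArrayList<>();
        for (Node node : container.childNodes()) {
            if (node instanceof TextNode) {
                TextNode text = (TextNode) node;
                // Skip whitespace-only nodes such as the newline after <div>.
                if (!text.isBlank()) {
                    out.add("Text node: " + text.text().trim());
                }
            } else if (node instanceof Element) {
                Element el = (Element) node;
                out.add("Node <" + el.tagName() + ">: " + el.text());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<div>\n"
                + "Some text <b>with tags</b> might go here.\n"
                + "<p>Also there are paragraphs</p>\n"
                + "More text can go without paragraphs<br/>\n"
                + "</div>";
        describe(html).forEach(System.out::println);
    }
}
```

To rebuild the lines from the question instead of listing nodes, one option would be to append node.outerHtml() for inline elements such as b and start a new line at each p or br.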
Result: