Suppose I have a div like this:
<div>
This is a paragraph
written by someone
on the internet.
</div>
The problem is that when JSoup parses this, it puts everything on one line, so calling text() returns:
This is a paragraphwritten by someoneon the internet.
Now, I realize this isn’t really a JSoup problem, since the actual HTML doesn’t contain a space between the lines. However, is there any way to make JSoup (perhaps through an override or an option I haven’t seen) add a space between lines as it parses? I imagine it must be possible (I can inspect the element in Chrome, turn off word wrap, and get what I want), but I’m not sure JSoup can do this.
Any thoughts?
The following post shows how to get all of the text, including the line breaks:
Removing HTML entities while preserving line breaks with JSoup
The answer in the following post, together with the comment on it, shows another way (be sure to read the comment):
Remove HTML tags from a String
And this one has yet another way, if you check all the answers and comments:
How do I preserve line breaks when using jsoup to convert html to plain text?
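For what it's worth, here is a minimal sketch of one more option, which is not taken from the posts above. It assumes jsoup 1.11.2 or later (for Element.wholeText()) and that the lines inside the div are separated by literal line breaks, as in the question. LineBreakText, textWithSpaces and textWithLineBreaks are made-up names for illustration. If the lines are separated by <br> tags instead, the result may depend on the jsoup version, so check it against your own input.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LineBreakText {

    // Hypothetical helper: returns the element's text with every run of
    // whitespace (including line breaks) collapsed to a single space.
    static String textWithSpaces(Element element) {
        // wholeText() returns the text without normalizing its whitespace,
        // so the line breaks from the source are still present here.
        String raw = element.wholeText();
        return raw.trim().replaceAll("\\s+", " ");
    }

    // Hypothetical helper: returns the element's text with the original
    // line breaks kept, minus leading and trailing whitespace.
    static String textWithLineBreaks(Element element) {
        return element.wholeText().trim();
    }

    public static void main(String[] args) {
        // Assumption: same markup as in the question, with literal newlines.
        String html = "<div>\nThis is a paragraph\nwritten by someone\non the internet.\n</div>";
        Document doc = Jsoup.parse(html);
        Element div = doc.selectFirst("div");

        System.out.println(textWithSpaces(div));
        System.out.println(textWithLineBreaks(div));
    }
}
```

The first helper gives the single line with spaces that the question asks for; the second keeps the three lines separate, which is closer to what the linked posts are after.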