I’m using JSoup to parse this HTML content:
<div class="submitted">
<strong><a title="View user profile." href="/user/1">user1</a></strong>
on 27/09/2011 - 15:17
<span class="via"><a href="/goto/002">www.google.com</a></span>
</div>
Which looks like this in a web browser:
user1 on 27/09/2011 - 15:17 www.google.com
The username and the website can be parsed into variables using this:
String user = content.getElementsByClass("submitted").first().getElementsByTag("strong").first().text();
String website = content.getElementsByClass("submitted").first().getElementsByClass("via").first().text();
But I’m unsure how to get the "on 27/09/2011 - 15:17" part into a variable. If I use
String date = content.getElementsByClass("submitted").first().text();
it also contains the username and the website.
You can always remove the user and the website elements and then read the text that is left (you can clone your submitted element first if you do not want the remove actions to “damage” your document). What is left is the date text, on 27/09/2011 - 15:17.
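The original snippet for this answer did not survive, so here is a minimal sketch of the clone-and-remove idea, not necessarily the exact code that was posted. The class name `SubmittedDate` and the helper `extractDate` are made up for illustration; the jsoup calls used are `Element.clone()`, `Elements.remove()` and `Element.text()`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class, named for this example only.
class SubmittedDate {

    // Returns the text of the first "submitted" element without the
    // <strong> (user) and "via" (website) children.
    static String extractDate(Element content) {
        // Work on a copy so the original document is left untouched.
        Element submitted = content.getElementsByClass("submitted").first().clone();

        // Drop the user and the website from the copy.
        submitted.getElementsByTag("strong").remove();
        submitted.getElementsByClass("via").remove();

        // Only the date text node is left; text() normalizes and trims whitespace.
        return submitted.text();
    }

    public static void main(String[] args) {
        String html =
              "<div class=\"submitted\">\n"
            + "<strong><a title=\"View user profile.\" href=\"/user/1\">user1</a></strong>\n"
            + "on 27/09/2011 - 15:17\n"
            + "<span class=\"via\"><a href=\"/goto/002\">www.google.com</a></span>\n"
            + "</div>";

        Document doc = Jsoup.parse(html);
        String date = extractDate(doc);
        System.out.println(date);
    }
}
```

Because the removals happen on the clone, the existing lookups for `user` and `website` keep working on the original `content`. If only the element's direct text is needed, `Element.ownText()` may be worth trying as a shorter alternative.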