I am currently using Jsoup to parse an HTML page. The code is quite simple:
Document doc = null;
try {
    // Fetch and parse the page at the given URL
    doc = Jsoup.connect(link).get();
} catch (Exception e) {
    // System.out.println("Some error occurred.");
    textView.setText(e.getMessage());
}
It does give me the webpage I want, and I can later extract the data I need from it with its getElementsByTag method and so on. However, I only want to use part of the webpage; for example, I want to discard everything after < !-- /foo --> in my webpage. (There is actually no space between < and !, but I can't type that here.) Is there any way to drop everything after that string and get a new Document containing only the part I want? I checked the cookbook, but it only seems to process the webpage by its structure, so I am not sure whether it is OK to do something like a string removal. Thanks for reading.
You can use Document doc = Jsoup.parse(html), where html is the page's HTML as a string. That is, fetch the HTML as a string first, then do whatever operations you need on it (e.g. cut the HTML after the marker, adding any necessary closing tags), then parse the result with Jsoup.parse(html).
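The cutting step could look roughly like the sketch below. It is plain Java with no Jsoup dependency; the class name HtmlTruncator, the method cutAfterMarker, and the sample HTML are made up for illustration and are not part of Jsoup. It assumes the marker appears in the string exactly as typed, and it keeps the marker itself while dropping everything after it.

```java
// Hypothetical helper (not part of Jsoup): trims an HTML string at a marker.
class HtmlTruncator {

    // Returns html up to and including the first occurrence of marker.
    // If the marker is not found, the input is returned unchanged.
    static String cutAfterMarker(String html, String marker) {
        int idx = html.indexOf(marker);
        if (idx < 0) {
            return html;
        }
        return html.substring(0, idx + marker.length());
    }

    public static void main(String[] args) {
        // Sample input, assumed for illustration only
        String html = "<html><body><p>keep</p><!-- /foo --><p>drop</p></body></html>";
        String trimmed = cutAfterMarker(html, "<!-- /foo -->");
        System.out.println(trimmed);
        // prints: <html><body><p>keep</p><!-- /foo -->
    }
}
```

With Jsoup, one way to wire this in would be to get the raw page with String html = Jsoup.connect(link).execute().body(), pass it through cutAfterMarker, and hand the result to Jsoup.parse(trimmed, link). The trimmed string is missing its closing tags; the Jsoup parser is lenient about unclosed elements, but you can append them yourself if you want the string to be well-formed before parsing.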