I have been using Jsoup to parse my HTML files, and so far it does a great job. However, it cannot parse server tags ( <% … %> ). I decided to extend it, but I cannot find an easy way to extend its Parser, since the classes involved are private or package-level (e.g. TreeBuilder, TransitionState, etc.).
So I started looking at Jericho, as it claims it can parse server tags. However, its documentation is so poor that I cannot get started easily. Its API also seems less friendly than what Jsoup provides: it's not that straightforward to extract some nodes and move them around.
I wonder if anyone has been in a similar situation before, and how you solved it? In short, I just want to parse JSP files in Java. (Well, please don't ask me to implement a parser myself ;p )
In the end I found a workaround: put each server code block inside an HTML comment block, so that 1) the server code still gets executed correctly, and 2) Jsoup processes the whole block as an HTML comment node without touching anything inside.
e.g. <!-- <% … %> -->
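If the server blocks are not already inside comments in the source files, the wrapping could also be done as a pre-processing step before handing the text to Jsoup. The following is only a rough sketch of that idea using plain java.util.regex: the class name ServerTagCommenter and the methods wrap and unwrap are made up for illustration and are not part of Jsoup or Jericho. It assumes the server code never contains "--" or "%>" inside a string literal, and that server tags do not appear inside attribute values (there a comment would not be recognized as a comment).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Jsoup API): hides <% ... %> blocks inside
// HTML comments before parsing, and restores them afterwards.
final class ServerTagCommenter {

    // A server block: "<%" up to the nearest "%>", possibly spanning lines.
    private static final Pattern SERVER_TAG =
            Pattern.compile("<%.*?%>", Pattern.DOTALL);

    // A comment produced by wrap(): "<!--" + server block + "-->".
    private static final Pattern WRAPPED =
            Pattern.compile("<!--(<%.*?%>)-->", Pattern.DOTALL);

    // Surround every server block with <!-- and -->.
    // Assumption: the input contains no blocks that are already wrapped.
    static String wrap(String jsp) {
        Matcher m = SERVER_TAG.matcher(jsp);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(jsp, last, m.start())
               .append("<!--").append(m.group()).append("-->");
            last = m.end();
        }
        return out.append(jsp, last, jsp.length()).toString();
    }

    // Reverse of wrap(): strip the comment delimiters around server blocks.
    static String unwrap(String html) {
        return WRAPPED.matcher(html).replaceAll("$1");
    }
}
```

The intended use would be to call wrap on the JSP text, parse and rearrange the result with Jsoup (which should keep each wrapped block as a comment node), and then call unwrap on the serialized output. Whether the round trip is exact depends on how Jsoup serializes comments, so it is worth checking on a real file.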
It works well for me now! Hope this helps people who run into the same problem! 😉