I am trying to crawl a web page that is built with GWT and uses the GWT RPC mechanism for its AJAX calls. The page is not mine, so I can't edit the server side. I am very new to GWT, and from my first couple of days with it, I think you can't deserialize the RPC data unless you have the interface definitions the server uses.
Am I right, or is there a way to crawl the data intelligently?
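You could do it using HtmlUnit and its WebClient class. HtmlUnit is a headless browser that runs the page's own JavaScript, so the GWT client code makes and decodes the RPC calls itself and you only read the rendered page: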
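The snippet that originally followed is not available, so here is a minimal sketch of the idea rather than the exact code. It assumes HtmlUnit 2.x on the classpath (package `com.gargoylesoftware.htmlunit`; in 3.x the package is `org.htmlunit`). The class name `GwtPageCrawler`, the method `fetchRenderedXml`, the placeholder URL, and the 10-second wait are all made up for illustration.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

public class GwtPageCrawler {

    // Loads the page, lets its JavaScript (including GWT RPC calls) run,
    // and returns the resulting DOM serialized as XML.
    static String fetchRenderedXml(String url, long waitMillis) throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            HtmlPage page = webClient.getPage(url);
            // Block until background JavaScript finishes or the timeout expires.
            webClient.waitForBackgroundJavaScript(waitMillis);
            return page.asXml();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical URL; replace with the page you want to crawl.
        String url = args.length > 0 ? args[0] : "http://example.com/gwt-app";
        System.out.println(fetchRenderedXml(url, 10_000));
    }
}
```

From the returned markup (or from the `HtmlPage` object itself) you can then pick out the elements the GWT code filled in, without ever touching the RPC payload format.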
You might have to experiment with the WebClient options a bit. In my case these seem to do a good job:
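The exact options used in the original answer are not preserved, so the following is only a plausible starting point, not the author's settings. It again assumes HtmlUnit 2.x; `GwtWebClientFactory` and `newLenientClient` are hypothetical names, and the 30-second timeout is an arbitrary choice.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;

public class GwtWebClientFactory {

    // Builds a WebClient that tolerates the script errors and odd status
    // codes that real-world GWT pages often trigger in HtmlUnit.
    static WebClient newLenientClient() {
        WebClient webClient = new WebClient(BrowserVersion.CHROME);
        webClient.getOptions().setJavaScriptEnabled(true);
        // Keep going when the page's JavaScript throws.
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        // Do not abort on 4xx/5xx responses to secondary requests.
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        // CSS is rarely needed for data extraction and slows loading.
        webClient.getOptions().setCssEnabled(false);
        // Connection/read timeout in milliseconds (assumed value).
        webClient.getOptions().setTimeout(30_000);
        // Turn short asynchronous AJAX calls into synchronous ones.
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        return webClient;
    }
}
```

If the page still comes back half-rendered, lengthening the `waitForBackgroundJavaScript` timeout or trying a different `BrowserVersion` are the usual next things to try.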