I want to load external content (from another domain) and simulate navigation, doing things like programmatically clicking and filling in forms, probably using jQuery.
To explain a little better: I need to navigate “automatically” through 3 pages. The first one is a login area, where I’m supposed to fill in the login/password fields and submit. On the last one, I must fill in some input fields, submit again, and get all the HTML data from a report.
I was trying to use an IFRAME and jQuery’s contents(), then I realized that I cannot do that because of the obvious cross-domain security restrictions (http://jsfiddle.net/TbMyx/4/).
Before trying it this way (client-side, JS, IFRAME, etc.), I also tried using Java, sending POST/GET requests from a Servlet, and I didn’t have any success with that either.
Any thoughts on that? Is this task at least possible?
I’m a little pessimistic about it. Based on my current knowledge, I don’t think this is really possible; I just need some confirmation.
Yes, it is possible. It’s called web scraping, and it is fairly common.
As you have learned, it is not possible to do this on the client side using JavaScript, because the browser’s same-origin policy stops a script from reading pages that belong to another domain.
On the server side, you have two options. a) Load up an actual browser and navigate the website just like a user would, or b) Use a headless browser, which is basically a library that simulates a real browser.
Using a Headless Browser
In general, this is the faster and easier approach, but it may not work for complex websites that depend heavily on JavaScript.
For Java, HtmlUnit is a great library. Keep the Fiddler request/response capture from your browser handy, because it’s possible the browser sends cookies or headers that differ from the ones HtmlUnit sends. In general, if you match all the headers that the browser is sending, the website will usually respond correctly.
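To illustrate the point about matching headers and keeping cookies, here is a minimal sketch. Note that it does not use HtmlUnit; to stay dependency-free it uses the JDK's built-in java.net.http.HttpClient (Java 11+), which only covers the plain HTTP side and does not run JavaScript. The URL, the form field names, and the header values are placeholders I made up, so replace them with whatever your Fiddler capture shows.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class BrowserLikeRequests {
    // Hypothetical value: copy the real User-Agent from your own browser capture.
    static final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";

    // One client for the whole session, so cookies set by the login page
    // are sent back automatically on the following requests.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    // Encodes form fields the way a browser does for a normal form submit.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Builds a form POST carrying browser-like headers (values are assumptions).
    static HttpRequest formPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .header("Accept", "text/html,application/xhtml+xml")
                .header("Accept-Language", "en-US,en;q=0.9")
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical field names; use the "name" attributes of the real login form.
        Map<String, String> login = new LinkedHashMap<>();
        login.put("login", "myUser");
        login.put("pass", "myPassword");

        // Only builds and prints the request; nothing is sent over the network here.
        HttpRequest request = formPost("https://example.com/login", login);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(encodeForm(login));
    }
}
```

To actually send a request, call newSessionClient() once and then client.send(request, HttpResponse.BodyHandlers.ofString()) for each page. Reusing the same client for all three pages is what lets the session cookie from the login carry over to the report page.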
Using an Actual Browser
Use this only if your attempts with a headless browser fail. This approach brings up a browser and navigates the website just like a user does.
You can use Selenium/WebDriver for this purpose. Be warned that running a browser in a server environment is resource-intensive and takes more time than the headless approach.