I am writing a very basic web spider in Java. I am facing one problem: the content loaded for a URL differs from what the browser shows. For example, try the URL below.
If you load this URL in a browser and through Java's URL class, the contents are different. This may be for the following reasons:
- JavaScript may be sending XMLHttpRequests and concatenating the results to render the final HTML.
- URL redirects may lead to the page that finally renders the HTML.
- Any other reasons that I don't know of.
So is there a way to simulate a browser in my Java program? Are there any third-party libraries that load the page the way a browser does and return the final content? Any help is appreciated.
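The redirect case from the list of possible causes can be checked with plain JDK classes before bringing in a library. HttpURLConnection follows redirects by default, but not when the redirect switches protocol (for example from http to https), and it sends a "Java/..." User-Agent that some servers answer differently. Below is a minimal sketch that follows redirects by hand and sends a browser-like User-Agent. The class name RedirectFetcher, the User-Agent string, the timeouts, and the UTF-8 decoding are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
class RedirectFetcher {

    // Assumed browser-like User-Agent; any realistic value would do.
    static final String USER_AGENT =
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Fetches the body at 'address', following at most 'maxRedirects' redirects.
    static String fetch(String address, int maxRedirects) throws IOException {
        URL url = new URL(address);
        for (int hop = 0; hop <= maxRedirects; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // Handle redirects ourselves so protocol changes are followed too.
            conn.setInstanceFollowRedirects(false);
            conn.setRequestProperty("User-Agent", USER_AGENT);
            conn.setConnectTimeout(10_000);
            conn.setReadTimeout(10_000);

            int status = conn.getResponseCode();
            boolean redirect = status == 301 || status == 302 || status == 303
                    || status == 307 || status == 308;
            if (redirect) {
                String location = conn.getHeaderField("Location");
                conn.disconnect();
                if (location == null) {
                    throw new IOException("Redirect without Location from " + url);
                }
                // Resolve relative Location values against the current URL.
                url = new URL(url, location);
                continue;
            }

            try (InputStream in = conn.getInputStream()) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                // Assumes the page is UTF-8; a real spider should read the charset
                // from the Content-Type header.
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        throw new IOException("More than " + maxRedirects + " redirects for " + address);
    }
}
```

If RedirectFetcher.fetch(url, 5) returns the same HTML as the browser's "view source", redirects or the User-Agent were the cause. If the browser's rendered page still differs, the remaining content is most likely produced by JavaScript, which this approach cannot execute.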
Try HtmlUnit, a headless browser written in Java. It can emulate browser behaviour and execute JavaScript.
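A minimal sketch of how that might look, assuming HtmlUnit 3.x is on the classpath (in 2.x releases the same classes live under com.gargoylesoftware.htmlunit instead of org.htmlunit). The class name RenderedPageFetcher, the placeholder URL, and the 5-second wait are assumptions for illustration, not values from the question.

```java
import org.htmlunit.BrowserVersion;
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;

// Hypothetical example class; requires the HtmlUnit library on the classpath.
class RenderedPageFetcher {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; pass the real one as the first argument.
        String url = args.length > 0 ? args[0] : "http://example.com/";

        // WebClient is AutoCloseable, so try-with-resources releases its resources.
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.getOptions().setRedirectEnabled(true);
            // A spider usually only needs the markup, so CSS processing is skipped.
            webClient.getOptions().setCssEnabled(false);
            // Real-world pages often contain scripts that fail; keep going anyway.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            // Give XMLHttpRequest callbacks up to 5 seconds to finish (assumed value).
            webClient.waitForBackgroundJavaScript(5_000);

            // The DOM after script execution, serialized as XML.
            System.out.println(page.asXml());
        }
    }
}
```

The output of page.asXml() reflects the DOM after scripts have run, so it should be closer to what the browser's inspector shows than to the raw response. HtmlUnit's JavaScript support is not identical to a real browser's, so script-heavy pages may still differ.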