I want to develop an HTTP client in Java for a college project. It needs to log in to a site, extract data from HTML pages, and fill in and submit forms.
I don't know which HTTP library to use:
Apache HttpClient: doesn't build a DOM model, but handles HTTP redirects and multi-threading.
HttpUnit: builds a DOM model and makes it easy to work with forms, fields, tables, etc., but I don't know how well it handles multi-threading and proxy settings.
Any advice?
It sounds like you are trying to create a web-scraping application. For this purpose, I recommend the HtmlUnit library.
It makes it easy to work with forms, proxies, and data embedded in web pages. Under the hood I think it uses Apache's HttpClient to handle HTTP requests, but that is probably a lower-level detail than you need to worry about.
With this library you can control a web page in Java the same way you would control it in a web browser: clicking a button, typing text, selecting values.
Here are some examples from HtmlUnit’s getting started page:
Submitting a form:
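A minimal sketch of that flow, modeled on the getting-started example. It assumes HtmlUnit 3.x or later is on the classpath (package `org.htmlunit`; the older 2.x releases use `com.gargoylesoftware.htmlunit` instead). The URL, the form name `myform`, and the field names `userid` and `submitbutton` are placeholders, so substitute the ones from the page you are targeting.

```java
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlForm;
import org.htmlunit.html.HtmlPage;
import org.htmlunit.html.HtmlSubmitInput;
import org.htmlunit.html.HtmlTextInput;

public class SubmitFormExample {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the client and its windows when done
        try (WebClient webClient = new WebClient()) {
            // Placeholder URL: replace with the page that contains the form
            HtmlPage page1 = webClient.getPage("https://example.com/login");

            // Look up the form, then the field and button inside it
            // (all three names are placeholders)
            HtmlForm form = page1.getFormByName("myform");
            HtmlSubmitInput button = form.getInputByName("submitbutton");
            HtmlTextInput textField = form.getInputByName("userid");

            // Type into the field as a user would
            textField.type("root");

            // Clicking the button submits the form and returns the next page
            HtmlPage page2 = button.click();
            System.out.println(page2.getTitleText());
        }
    }
}
```

If the form or an input with the given name does not exist, these lookups throw an exception rather than returning null, so a wrong name fails fast.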
Using a proxy server:
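A short sketch, again assuming HtmlUnit 3.x (`org.htmlunit`). The proxy host `proxy.example.com`, the port `8080`, and the target URL are placeholders. The proxy is passed to the `WebClient` constructor along with the browser version to emulate.

```java
import org.htmlunit.BrowserVersion;
import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;

public class ProxyExample {
    public static void main(String[] args) throws Exception {
        // Placeholder proxy settings: replace with your own
        String proxyHost = "proxy.example.com";
        int proxyPort = 8080;

        // All requests made by this client are routed through the proxy
        try (WebClient webClient =
                 new WebClient(BrowserVersion.CHROME, proxyHost, proxyPort)) {
            // Placeholder URL
            HtmlPage page = webClient.getPage("https://example.com/");
            System.out.println(page.getTitleText());
        }
    }
}
```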
The WebClient class is single-threaded, so every thread that deals with a web page will need its own WebClient instance.
Unless you need to process JavaScript or CSS, you can also disable these when you create the client:
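One way to combine both points, sketched under the same assumption of HtmlUnit 3.x (`org.htmlunit`). The helper `newLightweightClient` and the URLs are made up for illustration. Each task creates and closes its own `WebClient`, so no instance is ever shared between threads.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.htmlunit.WebClient;
import org.htmlunit.html.HtmlPage;

public class PerThreadClients {

    // Hypothetical helper: builds a client with JavaScript and CSS turned off
    static WebClient newLightweightClient() {
        WebClient webClient = new WebClient();
        webClient.getOptions().setJavaScriptEnabled(false);
        webClient.getOptions().setCssEnabled(false);
        return webClient;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder URLs
        List<String> urls = List.of("https://example.com/a", "https://example.com/b");

        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (String url : urls) {
            pool.submit(() -> {
                // Each task owns its client for the duration of the fetch
                try (WebClient webClient = newLightweightClient()) {
                    HtmlPage page = webClient.getPage(url);
                    System.out.println(url + " -> " + page.getTitleText());
                } catch (Exception e) {
                    System.err.println("Failed to fetch " + url + ": " + e);
                }
            });
        }

        // Stop accepting tasks and wait for the running ones to finish
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Creating a client per task is the simplest safe choice. If client start-up cost becomes noticeable, keeping one client per worker thread (for example in a `ThreadLocal`) is another option, as long as each instance stays confined to its thread.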