I access a web page by passing a session ID and a URL, and the output is an HTML response.
I want to use Jsoup to parse this response and get the tag elements.
The Jsoup examples I have seen take a String to establish the connection. How do I proceed?
pseudo code:
I tried the above method and got this exception:
java.io.IOException: 401 error loading URL http://www.abc.com/index
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:387)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
Basically, entity.getContent() holds the HTML response, which I pass as a String to the connect method, but that doesn't work.
Apache Commons HttpClient and Jsoup do not share the same cookie store. Jsoup#connect() takes a URL, not HTML content, and fires its own HTTP request without the session cookies, which would explain the 401. You need to pass the very same cookies that HttpClient retrieved on to Jsoup's Connection. You can find some concrete examples here:
Alternatively, you can just continue using HttpClient for firing HTTP requests and maintaining the cookies, and instead feed its HttpResponse as a String to Jsoup#parse(). So this should do:
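A minimal sketch of this approach, assuming Apache HttpClient 4.x (the org.apache.http packages) is on the classpath next to Jsoup. The class name SessionPageParser and its two helper methods are hypothetical names chosen for illustration; the httpclient passed in is the one that already holds the session cookies.

```java
import java.io.IOException;

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.util.EntityUtils;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical helper class: HttpClient does the HTTP work, Jsoup only parses.
public class SessionPageParser {

    // Fetches the page with the HttpClient that already holds the session cookies.
    // EntityUtils.toString() reads the whole entity and uses the charset declared
    // in the Content-Type header, if there is one.
    static String fetchHtml(HttpClient httpclient, String url) throws IOException {
        HttpResponse response = httpclient.execute(new HttpGet(url));
        return EntityUtils.toString(response.getEntity());
    }

    // Parses HTML that is already in memory; Jsoup fires no HTTP request here.
    // The base URI is only used to resolve relative links in the document.
    static Document parseHtml(String html, String baseUri) {
        return Jsoup.parse(html, baseUri);
    }
}
```

With these in place, parseHtml(fetchHtml(httpclient, url), url) should give a Document on which select() can be called, for example document.select("a[href]") to get all links. The EntityUtils.toString() call in fetchHtml is the short way of reading the entity as a String.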
By the way, you do not necessarily need to create a whole new HttpClient for a subsequent request. Just reuse the httpclient which you already created. Also, your way of obtaining the response as a String is clumsy. The second line in the above example shows the simplest way to do it.
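As for the first option (passing HttpClient's cookies on to Jsoup's Connection), a rough sketch could look like this. It again assumes Apache HttpClient 4.x; the class name JsoupCookieBridge and its methods are hypothetical, and only cookie names and values are copied, so domain, path and expiry are ignored.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.http.client.CookieStore;
import org.apache.http.cookie.Cookie;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

// Hypothetical helper class: hands HttpClient's cookies over to Jsoup.
public class JsoupCookieBridge {

    // Copies the name/value pairs out of HttpClient's cookie store.
    // Domain, path and expiry are not taken into account in this sketch.
    static Map<String, String> toCookieMap(CookieStore cookieStore) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (Cookie cookie : cookieStore.getCookies()) {
            cookies.put(cookie.getName(), cookie.getValue());
        }
        return cookies;
    }

    // Prepares a Jsoup connection carrying those cookies.
    // No request is sent until get() or execute() is called on the result.
    static Connection connectWithCookies(String url, CookieStore cookieStore) {
        return Jsoup.connect(url).cookies(toCookieMap(cookieStore));
    }
}
```

Assuming httpclient is a DefaultHttpClient, the call would be connectWithCookies(url, httpclient.getCookieStore()).get(). Whether the server accepts the request this way depends on how it tracks the session, so the Jsoup#parse() route is usually the simpler one.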