I am using the Jsoup Java HTML parser to fetch images from a particular URL, but some of the images fail with an HTTP 502 status code and are not saved to my machine. Here is the code I used:
String url = "http://www.jabong.com";
// connect(...).get() already returns a parsed Document, so no second parse is needed
Document doc = Jsoup.connect(url).get();
Elements images = doc.select("img");
for (Element element : images) {
    // "abs:src" resolves the src attribute against the page's base URI
    String imgSrc = element.attr("abs:src");
    log.info(imgSrc);
    // compare string contents with isEmpty(), not with != ""
    if (!imgSrc.isEmpty()) {
        saveFromUrl(imgSrc, dirPath + "/" + nameCounter + ".jpg");
        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            log.error("error in sleeping");
            Thread.currentThread().interrupt();
        }
        nameCounter++;
    }
}
The saveFromUrl function looks like this:
public static void saveFromUrl(String imageUrl, String destinationFile) {
    // try-with-resources closes both streams even when the copy fails
    try (InputStream is = new URL(imageUrl).openStream();
         OutputStream os = new FileOutputStream(destinationFile)) {
        byte[] buffer = new byte[2048];
        int length;
        while ((length = is.read(buffer)) != -1) {
            os.write(buffer, 0, length);
        }
    } catch (IOException e) {
        // include the exception so the failing status (e.g. 502) shows up in the log
        log.error("Error in saving file from url: " + imageUrl + " (" + e + ")");
    }
}
I searched the internet for status code 502, and it says the error is due to a bad gateway, which I don't understand. One possibility I can think of is that the error occurs because I am sending GET requests for the images in a loop. Maybe the web server cannot handle that much load and rejects a request when the previous image has not been sent yet. So I tried sleeping after fetching each image, but no luck 🙁
Any advice would be appreciated.
Here’s a full code example that works for me…
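A minimal sketch of such a download, assuming the images are fetched through URL.openConnection() as mentioned further down. The class name ImageDownloader, the User-Agent value and the timeouts are illustrative assumptions, not taken from the original answer. It checks the HTTP status before saving, so a 502 is reported instead of being swallowed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class ImageDownloader {

    /** Downloads one resource; returns true if it was saved, false if the server refused it. */
    static boolean saveFromUrl(String address, Path destination) throws IOException {
        URLConnection connection = new URL(address).openConnection();
        // Assumption: some servers reject clients without a browser-like
        // User-Agent; the value below is only illustrative.
        connection.setRequestProperty("User-Agent", "Mozilla/5.0");
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);

        // Only HTTP(S) connections carry a status code
        if (connection instanceof HttpURLConnection) {
            int status = ((HttpURLConnection) connection).getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                System.err.println("Skipping " + address + " (HTTP " + status + ")");
                return false;
            }
        }
        // Stream the body straight to disk, replacing any existing file
        try (InputStream in = connection.getInputStream()) {
            Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java ImageDownloader <image-url> <destination-file>");
            return;
        }
        boolean saved = saveFromUrl(args[0], Paths.get(args[1]));
        System.out.println(saved ? "Saved " + args[1] : "Not saved");
    }
}
```

Whether the extra header actually avoids the 502 depends on the server; the status check at least makes it clear which URLs fail and with what code.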
You should see the following output on your console…
So that’s a working example without a Proxy server involved.
If you need to authenticate with a proxy server, here’s an additional class that you’ll need, based on this Oracle technote:
And to use this new class, you would use the following code in place of the call to openConnection() shown above:
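A sketch of what such a class and its use might look like, built on the standard java.net.Authenticator and java.net.Proxy APIs. The names ProxyAuthenticator, ProxyExample and openViaProxy, as well as the host, port and credentials in the usage comment, are hypothetical and not taken from the original answer.

```java
import java.io.IOException;
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

/** Supplies credentials when the proxy asks for authentication. */
class ProxyAuthenticator extends Authenticator {
    private final String user;
    private final char[] password;

    ProxyAuthenticator(String user, String password) {
        this.user = user;
        this.password = password.toCharArray();
    }

    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        // Only answer challenges from the proxy, never from the origin server
        if (getRequestorType() == RequestorType.PROXY) {
            return new PasswordAuthentication(user, password);
        }
        return null;
    }
}

class ProxyExample {
    /** Creates (but does not yet open) a connection routed through an HTTP proxy. */
    static URLConnection openViaProxy(String address, String proxyHost, int proxyPort,
                                      String user, String password) throws IOException {
        // Registers the authenticator JVM-wide; it is consulted on a proxy challenge
        Authenticator.setDefault(new ProxyAuthenticator(user, password));
        Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort));
        return new URL(address).openConnection(proxy);
    }
}

// Usage (hypothetical values):
// URLConnection connection = ProxyExample.openViaProxy(
//         "http://www.example.com/image.jpg", "proxy.example.com", 8080, "user", "secret");
```

Note that Authenticator.setDefault applies to the whole JVM, which is why the authenticator checks the requestor type before handing out the credentials.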