I’m creating a Java program that will read an HTML document from a URL and display the sizes of the images referenced in it. I’m not sure how to go about achieving this, though.
I don’t need to actually download and save the images; I just need their sizes and the order in which they appear on the webpage.
For example, a webpage has 3 images:
<img src="dog.jpg" /> //which is 54kb
<img src="cat.jpg" /> //which is 75kb
<img src="horse.jpg"/> //which is 80kb
I would need the output of my Java program to be:
54kb
75kb
80kb
Any ideas where I should start?
P.S. I’m a bit of a Java newbie.
If you’re new to Java you may want to leverage an existing library to make things a bit easier. Jsoup allows you to fetch an HTML page and extract elements using CSS-style selectors.
This is just a quick and very dirty example, but I think it will show how easy Jsoup can make such a task. Please note that error handling and response-code handling were omitted; I merely wanted to pass on the general idea:
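A minimal sketch of that idea, assuming Jsoup is on the classpath, might look like the following. The class name `ImageSizes`, its helper methods, and the `http://example.com/` default are made up for illustration. It also assumes the server answers HEAD requests and sends a `Content-Length` header, which not every server does.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ImageSizes {

    // Collects the absolute URL of every <img src="..."> in document order.
    static List<String> imageUrls(Document doc) {
        List<String> urls = new ArrayList<>();
        for (Element img : doc.select("img[src]")) {
            // absUrl resolves relative paths against the page's base URI
            String url = img.absUrl("src");
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    // Sends a HEAD request so only the headers come back, then reads Content-Length.
    // Returns -1 when the header is missing or is not a number.
    static long sizeInBytes(String imageUrl) throws IOException {
        Connection.Response response = Jsoup.connect(imageUrl)
                .ignoreContentType(true) // images are not HTML; Jsoup rejects them otherwise
                .method(Connection.Method.HEAD)
                .execute();
        String length = response.header("Content-Length");
        if (length == null) {
            return -1;
        }
        try {
            return Long.parseLong(length.trim());
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Formats a byte count as whole kilobytes (1 kb = 1024 bytes, rounded down).
    static String formatKb(long bytes) {
        return bytes < 0 ? "unknown" : (bytes / 1024) + "kb";
    }

    public static void main(String[] args) throws IOException {
        // Placeholder default; pass the real page URL as the first argument.
        String pageUrl = args.length > 0 ? args[0] : "http://example.com/";
        Document doc = Jsoup.connect(pageUrl).get();
        for (String url : imageUrls(doc)) {
            System.out.println(formatKb(sizeInBytes(url)));
        }
    }
}
```

Run it with the page URL as the only argument. Images whose server does not report a length print `unknown`; for those you would have to download the bytes and count them yourself.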
You might also consider looking at Apache HttpClient. I generally find it preferable to the raw URLConnection/HttpURLConnection approach.
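For comparison, here is roughly what the raw HttpURLConnection approach looks like for getting one image's size with only the JDK. This is a sketch, not a recommendation: `HeadRequestSize`, `contentLength`, and `formatKb` are hypothetical names, the URL is a placeholder, and it assumes the server reports a `Content-Length` for HEAD requests. With Apache HttpClient the connection handling would be replaced by that library's request and response objects.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class HeadRequestSize {

    // Issues a HEAD request and returns the reported Content-Length in bytes,
    // or -1 if the server did not send one.
    static long contentLength(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD"); // headers only, no image body is transferred
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + url);
            }
            return conn.getContentLengthLong();
        } finally {
            conn.disconnect();
        }
    }

    // Formats a byte count as whole kilobytes (1 kb = 1024 bytes, rounded down).
    static String formatKb(long bytes) {
        return bytes < 0 ? "unknown" : (bytes / 1024) + "kb";
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; pass a real image URL as the first argument.
        String imageUrl = args.length > 0 ? args[0] : "http://example.com/dog.jpg";
        System.out.println(formatKb(contentLength(imageUrl)));
    }
}
```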