We have a task to design a class that can download the source of any web page. But when I test my code and fetch a page like http://anidb.net/perl-bin/animedb.pl?show=main, nothing works.
Standard code like this fails:
import java.net.*;
import java.io.*;

public class URLReader {
    public static void main(String[] args) throws Exception {
        URL link = new URL("http://www.anidb.net/");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(link.openStream()));
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}
Here is the result I got:
Šwq>²"¦§5´_ï__ÇUº=ôÙö?kŠ}~“bd`?l“Ïçz¢Çêõ>_"?j׉R“y}K¸\Ìc_DLÙªÏ_
–óMm_¼_0”•ö°ËC_aí½sî¤ìÁS ‚>dC0ìs_–y¹ñ±ÏÝÜAø%È_äÖá__æ©A@,4x„ж_ëɃ?
I have tried everything I could think of, including cookies and request headers, but nothing seems to work. If you have a hint for me, I would appreciate it.
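When writing an HTTP client, you have to take gzip content encoding into account, as well as chunked transfer encoding. The unreadable output above looks like a compressed response body printed as if it were text. It's better to use a library to download a web page.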
Try something like this:
http://code.google.com/p/google-http-java-client/
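If you would rather stay with the plain JDK classes, the gzip part can be handled by hand. Below is a minimal sketch, not a definitive implementation. It assumes the server marks compressed responses with a Content-Encoding: gzip header and that the page is UTF-8 (I have not checked what anidb.net actually sends). The class name GzipAwareReader and the helper names decode, readAll and fetch are made up for this example. HttpURLConnection already takes care of chunked transfer, so only the decompression is added:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class GzipAwareReader {

    // Wraps the raw stream in a GZIPInputStream when the response
    // is declared as gzip-compressed; otherwise returns it unchanged.
    public static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    // Reads the whole stream as text, line by line.
    // UTF-8 is an assumption; a real client should honor the charset in Content-Type.
    public static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Fetches a page, announcing gzip support and decompressing if needed.
    public static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        try (InputStream raw = conn.getInputStream()) {
            return readAll(decode(raw, conn.getContentEncoding()));
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GzipAwareReader <url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

Run it as java GzipAwareReader http://www.anidb.net/ to try it against the page from the question. This only covers gzip; other encodings such as deflate, redirects and charset detection are left out, which is a large part of why a library is the easier route.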