I’m trying to download the following page: http://structureddata.wikispaces.com/Test
wget without any option fails:
wget "http://structureddata.wikispaces.com/Test"
(...) connect to session.wikispaces.com insecurely, use `--no-check-certificate'
With --no-check-certificate, it works:
wget --no-check-certificate "http://structureddata.wikispaces.com/Test"
grep Hello Test
Hello World
Now I would like to download the same URL with Java, but the following simple program:
import java.net.*;
import java.io.*;

public class Test
{
    public static void main(String args[])
    {
        int c;
        try
        {
            InputStream in = new URL("http://structureddata.wikispaces.com/Test").openStream();
            while ((c = in.read()) != -1) System.out.print((char) c);
            in.close();
        }
        catch (Throwable err)
        {
            err.printStackTrace();
        }
    }
}
prints nothing.
What should I do to download the page with Java?
Many thanks,
Ppierre
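The Java URL interface is fairly low-level; it does not automatically do everything a tool like wget does. In particular, HttpURLConnection will not follow a redirect that changes protocol (for example from http to https), which the wget output above suggests is what happens here. Your code above prints nothing because the response it receives has no content.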
By doing something like the below, you’ll see that what you are getting is an HTTP 302 response — a redirect.
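A minimal sketch of such a check, not necessarily the exact snippet the answer refers to. It turns off automatic redirect handling and prints the status line and, for a redirect, the Location header. The class name RedirectCheck and the isRedirect helper are illustrative names only.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class RedirectCheck {

    // True for the 3xx status codes that normally carry a Location header.
    static boolean isRedirect(int status) {
        return status == HttpURLConnection.HTTP_MOVED_PERM   // 301
            || status == HttpURLConnection.HTTP_MOVED_TEMP   // 302
            || status == HttpURLConnection.HTTP_SEE_OTHER    // 303
            || status == 307
            || status == 308;
    }

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://structureddata.wikispaces.com/Test");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Disable automatic redirect handling so the raw response is visible.
        conn.setInstanceFollowRedirects(false);

        int status = conn.getResponseCode();
        System.out.println("Status:   " + status + " " + conn.getResponseMessage());
        if (isRedirect(status)) {
            // Where the server wants the client to go next.
            System.out.println("Location: " + conn.getHeaderField("Location"));
        }
        conn.disconnect();
    }
}
```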
I’d suggest using a library like HTTPClient which will handle more of the protocol for you.
(credit where it is due: Copied the above code from here.)
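If adding a library is not an option, the redirect can also be followed by hand with the JDK alone. The following is only a sketch under assumptions: the class name FollowRedirects, the helpers resolve and open, and the limit of 5 hops are illustrative choices. It does not carry cookies between hops, and since wget needed --no-check-certificate for this site, the HTTPS hop may still fail in Java with a certificate error.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

public class FollowRedirects {

    // Resolve a Location header value (absolute or relative) against the current address.
    static String resolve(String base, String location) {
        return URI.create(base).resolve(location).toString();
    }

    // Open the address, following up to maxHops redirects, including http -> https.
    static InputStream open(String address, int maxHops) throws IOException {
        for (int hop = 0; hop <= maxHops; hop++) {
            URL url = URI.create(address).toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false);
            int status = conn.getResponseCode();
            boolean redirect = status == 301 || status == 302 || status == 303
                    || status == 307 || status == 308;
            if (!redirect) {
                return conn.getInputStream();
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without a Location header");
            }
            address = resolve(address, location);
        }
        throw new IOException("Too many redirects");
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = open("http://structureddata.wikispaces.com/Test", 5)) {
            byte[] buf = new byte[4096];
            int n;
            // Copy the response body to standard output.
            while ((n = in.read(buf)) != -1) {
                System.out.write(buf, 0, n);
            }
            System.out.flush();
        }
    }
}
```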