Hello, my goal is to build an application that determines the validity of HTML links. However, in the following code:
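// requires java.io.IOException, java.net.HttpURLConnection, java.net.URL
try
{
    // create the HttpURLConnection
    URL url = new URL("http://www.thisurldoesnotexist");
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    // getResponseCode() opens the connection and sends the request
    System.out.println("Response code is " + connection.getResponseCode());
}
catch (IOException e)
{
    // covers a malformed URL, an unresolvable host, and connection failures
    System.out.println("Request failed: " + e);
}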
the nonsense URL resolves to an IP address, which I did not expect, and the code prints: “Response code is 200”
It seems my approach to distinguishing between existing and non-existent pages is flawed. Am I using the wrong tools to determine the validity of web pages? That is, is there a better way to differentiate between existing and non-existent web pages? Thanks so much.
You could:
This, however, adds complexity, since you will need to make a simple GET request through the socket and then validate the response to be sure that it is actually an HTTP server running on port 80.
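The socket approach described above could look roughly like the following. This is only a sketch under some assumptions: the class name HttpProbe, the helper names fetchStatusLine and looksLikeHttpStatusLine, the 5 second timeout, and the default host www.example.com are all made up for illustration, and the check only looks at the shape of the first response line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question or answer.
public class HttpProbe {

    // True if the line has the shape of an HTTP/1.x status line,
    // e.g. "HTTP/1.1 404 Not Found".
    static boolean looksLikeHttpStatusLine(String line) {
        return line != null && line.matches("HTTP/1\\.[01] \\d{3}( .*)?");
    }

    // Sends a minimal GET to host:port and returns the first response line
    // (null if the server closed the connection without replying).
    static String fetchStatusLine(String host, int port, int timeoutMs) throws IOException {
        try (Socket socket = new Socket()) {
            // Throws UnknownHostException if the name does not resolve.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);

            String request = "GET / HTTP/1.1\r\n"
                    + "Host: " + host + "\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return in.readLine();
        }
    }

    public static void main(String[] args) {
        // Assumed default host; pass another one as the first argument.
        String host = args.length > 0 ? args[0] : "www.example.com";
        try {
            String statusLine = fetchStatusLine(host, 80, 5000);
            if (looksLikeHttpStatusLine(statusLine)) {
                System.out.println("HTTP server answered: " + statusLine);
            } else {
                System.out.println("Port 80 is open but the reply is not HTTP: " + statusLine);
            }
        } catch (IOException e) {
            System.out.println("Could not reach " + host + ": " + e);
        }
    }
}
```

Note that this only shows that something speaking HTTP answered on port 80. If the resolver maps nonsense host names to a real web server, as seems to be happening in the question, that server will pass this check too, so the status code and possibly the page content still need to be inspected.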
NMap might be able to help you here.