I am trying to write a Java program that loads the pages pointed to by valid links and reports the other links as broken. My problem is that Java's URL class downloads the appropriate page if the URL is valid, but downloads the search-engine results for the URL if it is invalid.
Is there a Java function that detects whether a URL resolves to a legitimate page? Thanks very much,
Joel
You can get the HTTP response code for a URL like so:
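A minimal sketch of that lookup, assuming the standard `java.net.HttpURLConnection` API. The class name `ResponseCodeCheck`, the method name `responseCodeOf`, the `-1` return value for failures, the HEAD request, and the 5-second timeouts are all my own choices for illustration, not anything fixed by the library:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

class ResponseCodeCheck {

    /**
     * Returns the HTTP status code for the given address, or -1 if the
     * address cannot be parsed, is not an HTTP(S) URL, or cannot be reached.
     * (The -1 convention is an assumption made for this sketch.)
     */
    static int responseCodeOf(String address) {
        HttpURLConnection http = null;
        try {
            // URI.create rejects malformed input; toURL rejects relative URIs
            // and unknown protocols.
            URL url = URI.create(address).toURL();
            URLConnection conn = url.openConnection(); // no network traffic yet
            if (!(conn instanceof HttpURLConnection)) {
                return -1; // e.g. file: or ftp: URLs have no HTTP status
            }
            http = (HttpURLConnection) conn;
            http.setRequestMethod("HEAD");          // ask for headers only
            http.setInstanceFollowRedirects(false); // report 3xx codes as-is
            http.setConnectTimeout(5000);
            http.setReadTimeout(5000);
            return http.getResponseCode();          // performs the request
        } catch (IllegalArgumentException | IOException e) {
            return -1; // unparsable address or connection failure
        } finally {
            if (http != null) {
                http.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Example address only; substitute the link you want to check.
        System.out.println(responseCodeOf("https://example.com/"));
    }
}
```

Some servers answer HEAD differently from GET, so switching the request method to `"GET"` may be more reliable for a link checker, at the cost of downloading the body.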
Now the question is, what do you consider a “valid” webpage? For me, if a URL parses correctly, its protocol is “http” (or “https”), and its response code is in the 200 block or is 302 (Found/Redirect) or 304 (Not Modified), then it’s valid:
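One way to sketch that rule in code, again assuming `java.net.HttpURLConnection`. The names `LinkValidator`, `isAcceptableStatus`, and `isValidLink` are hypothetical, and disabling automatic redirects (so that a 302 is actually observed rather than followed) is my assumption about the intent:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

class LinkValidator {

    /** True for any 2xx code, 302 (Found) or 304 (Not Modified). */
    static boolean isAcceptableStatus(int code) {
        return (code >= 200 && code < 300) || code == 302 || code == 304;
    }

    /** Applies the three checks: parses, is http/https, acceptable status. */
    static boolean isValidLink(String address) {
        final URL url;
        try {
            url = URI.create(address).toURL();
        } catch (IllegalArgumentException | MalformedURLException e) {
            return false; // does not parse as an absolute URL
        }

        String protocol = url.getProtocol();
        if (!"http".equalsIgnoreCase(protocol) && !"https".equalsIgnoreCase(protocol)) {
            return false; // only web links are considered
        }

        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false); // keep 302 visible
            conn.setConnectTimeout(5000);           // timeouts are arbitrary choices
            conn.setReadTimeout(5000);
            return isAcceptableStatus(conn.getResponseCode());
        } catch (IOException e) {
            return false; // unreachable host, timeout, and so on
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Example addresses only; the first needs network access to succeed.
        System.out.println(isValidLink("https://example.com/"));
        System.out.println(isValidLink("ftp://example.com/file")); // false: not http(s)
    }
}
```

Keeping the status rule in its own method makes it easy to adjust, for example to also accept 301 (Moved Permanently) if permanent redirects should count as valid.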