I’ve been trying to modify this method from redirecting and returning the contents of the url to returning new valid url instead.
After reading up on the Net::HTTP object, I’m still not sure how exactly the get_response method works. Is this what’s downloading the page? is there another method I could call that would just ping the url instead of downloading it?
require 'net/http'
def validate(url)
uri = URI.parse(url)
response = Net::HTTP.get_response(uri)
case response
when Net::HTTPSuccess
return response
when Net::HTTPRedirection
return validate(response['location'])
else
return nill
end
end
puts validate('http://somesite.com/somedir/mypage.html')
You are correct that
get_responsesends an HTTP GET request to the server, which requests the whole page.You want to use a HEAD request instead of GET. This requests the same HTTP response header that a GET request would get, including the status code (200, 404, etc.), but without downloading the whole page.
See the
request_headandheadmethods ofNet::HTTP. For example