I have a rather strange question regarding URLs which point to another URL. So, for example, I have a URL:
http://mywebpage/this/is/a/forward
which ultimately points to another url:
http://mynewpage/this/is/new
My question is: when I use, for example, urllib2 in Python to fetch the first page, it ultimately fetches the second page. I would like to know if it’s possible to find out what the original link is pointing to. Is there something like a “header” which tells me the second link when I request the first link?
Sorry if this is a really silly question!
When you issue a GET request for the first URL, the web server will return a 300-series (redirect) status code, with a Location header whose value is the second URL. You can find out what the second URL was from Python with the geturl method of the object returned by urlopen. If there is more than one redirection involved, urlopen follows all of them and geturl reports only the last hop; that call does not expose the intermediate ones.
This will not handle redirections via JavaScript or meta http-equiv="refresh", but you probably aren’t in that situation or you wouldn’t have asked the question the way you did.
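Here is a minimal sketch of both ideas, under some assumptions. It uses Python 3's urllib.request, where urllib2's urlopen now lives, rather than urllib2 itself. Instead of the real pages from the question, it starts a small local server (RedirectDemoHandler, a made-up stand-in) that redirects /this/is/a/forward to /this/is/new. The helper names final_url and first_hop_location are also just illustrative.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class RedirectDemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: one path redirects, every other path returns a page."""

    def do_GET(self):
        if self.path == "/this/is/a/forward":
            # A 300-series reply whose Location header names the second URL.
            self.send_response(302)
            self.send_header("Location", "/this/is/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"new page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet; the default implementation logs each request to stderr.
        pass


def final_url(url):
    """Fetch url, letting urlopen follow redirects, and return the URL finally served."""
    with urlopen(url) as response:
        return response.geturl()


def first_hop_location(host, port, path):
    """Send one GET without following redirects; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Start the stand-in server on a free local port.
server = HTTPServer(("127.0.0.1", 0), RedirectDemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address
base = "http://%s:%d" % (host, port)

resolved = final_url(base + "/this/is/a/forward")
status, location = first_hop_location(host, port, "/this/is/a/forward")
print("geturl() reports:", resolved)
print("first reply:", status, "Location:", location)

server.shutdown()
server.server_close()
```

geturl gives you the end of the redirect chain as an absolute URL, while the raw request shows the status code and Location header of the first reply only. If you need every hop, one option is to repeat the raw request on each Location value until you get a non-redirect status.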