If I use urllib2 to open a URL like this:
import urllib
import urllib2
url = 'http://www.bbc.co.uk'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
values = {}
headers = { 'User-Agent' : user_agent }
data = urllib.urlencode(values)
req = urllib2.Request(url, data, headers)
response = urllib2.urlopen(req)
the_page = response.read()
it all works fine.
But I want the mobile version of the site, so I set the user agent to:
user_agent = 'Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8G4 Safari/6533.18.5'
which is what my iPhone sends when I visit a test page and read back its request headers.
However, if I run the above code with the user agent set to this, urllib2 freaks out and seems to follow an endless 302 redirect loop, which doesn’t occur when I visit the site on my iPhone.
urllib2 prints a whole heap of debug info showing that it is following lots of 302s, and then finally fails with:
urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently
Any ideas would be gratefully received.
Your problem is in how the redirect responses to your request are being handled.
Try one of these libraries, which handle redirects for you:
http://pypi.python.org/pypi/requests/0.7.3
or
http://wwwsearch.sourceforge.net/mechanize/
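If you would rather stay with the standard library, here is a minimal sketch of one thing worth trying. It rests on an assumption the question does not confirm: that the site sets a cookie during the redirect and keeps redirecting until that cookie is sent back, which is a common cause of this kind of loop. It is written against urllib.request and http.cookiejar, the Python 3 counterparts of urllib2 and cookielib. The helper names build_mobile_opener and fetch are made up for this example.

```python
import http.cookiejar
import urllib.request

# User-agent string taken from the question.
MOBILE_UA = ('Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) '
             'AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 '
             'Mobile/8G4 Safari/6533.18.5')


def build_mobile_opener(user_agent=MOBILE_UA):
    """Return (opener, cookie_jar) for an opener that keeps cookies.

    Hypothetical helper: the cookie processor stores cookies set by any
    response, including redirects, and sends them back on later requests.
    """
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar))
    # addheaders is applied to every request made through this opener,
    # including the follow-up requests issued while following redirects.
    opener.addheaders = [('User-Agent', user_agent)]
    return opener, cookie_jar


def fetch(url, opener):
    """Fetch url through the given opener and return the body as bytes."""
    # No data argument is passed, so the request goes out as a GET.
    with opener.open(url) as response:
        return response.read()
```

To use it, call opener, jar = build_mobile_opener() and then fetch('http://www.bbc.co.uk', opener). If the loop goes away, missing cookies were the cause. If it does not, the site is probably redirecting for some other reason, and one of the libraries above may be the easier route.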