    from google.appengine.api import urlfetch
    from google.appengine.ext import webapp

    class sss(webapp.RequestHandler):
        def get(self):
            url = "http://www.google.com/"
            result = urlfetch.fetch(url)
            if result.status_code == 200:
                self.response.out.write(result.content)
When I change the code to this:
    if result.status_code == 200:
        self.response.out.write(result.content.decode('utf-8').encode('gb2312'))
It shows something strange. What should I do?
When I use this:
    self.response.out.write(result.content.decode('big5'))
The page is different from the one I see at Google.com.
How can I get the same Google.com page that I see in my browser?
Google is probably serving you ISO-8859-1. At least, that is what they serve me for the User-Agent “AppEngine-Google; (+http://code.google.com/appengine)” (which urlfetch uses), and the Content-Type header of the response names that charset.
So you would decode the content as ISO-8859-1, for example with result.content.decode('iso-8859-1').
If you check result.headers["Content-Type"], your code can adapt to changes on the other end. You can generally pass the charset (ISO-8859-1 in this case) directly to the Python decode method.
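As a rough sketch of that idea, the charset can be pulled out of the Content-Type value and handed to decode. The helper names charset_from_content_type and decode_body are made up for illustration, and falling back to ISO-8859-1 when no usable charset is given is an assumption, not something urlfetch does for you. The parsing uses the standard library's email.message.Message, not an App Engine API.

```python
from email.message import Message


def charset_from_content_type(content_type, default="iso-8859-1"):
    """Return the charset named in a Content-Type value, or `default`.

    Hypothetical helper: `default` is an assumed fallback for responses
    that do not declare a charset.
    """
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() lowercases the charset parameter and returns
    # the fallback when the parameter is absent.
    return msg.get_content_charset(default)


def decode_body(body, content_type, default="iso-8859-1"):
    """Decode raw response bytes using the charset the server declared."""
    charset = charset_from_content_type(content_type, default)
    try:
        return body.decode(charset)
    except LookupError:
        # Python does not know this charset name; use the assumed fallback.
        return body.decode(default)
```

In the handler above this would be used roughly as text = decode_body(result.content, result.headers["Content-Type"]), followed by self.response.out.write(text.encode('utf-8')) if you want to serve the page as UTF-8. Bytes that are invalid for the declared charset will still raise UnicodeDecodeError; passing 'replace' as the second argument to decode is one way to tolerate that.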