I am using lxml to extract a string from a web page. What do I have to do to get the extracted data as a string without the error below? I guess I can't just use str() to solve the problem.
In Python:
mystring = MySQLdb.escape_string(i.text_content())
(<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii', u"\n\nEve Pownall\n\n \n \n \n \n Eve Pownall\n\t (Author)\n\t\n \u203a Visit Amazon's Eve Pownall Page\n Find all the books, read about the author, and more.\n\n See search results for this author \n Are you an author?\n Learn about Author Central\n \n \n \n \n\n \n amznJQ.onReady('bylinePopover', function () {});\n \n\n\n (Author)\n\n\n\n\n\n\n\n\n\n\n", 75, 76, 'ordinal not in range(128)'), <traceback object at 0x7f225c99f050>)
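You need to explicitly encode the string in a known encoding (most likely UTF-8) before passing it to MySQLdb.escape_string(). The traceback shows that text_content() returned a unicode object, and when a byte string is expected Python 2 tries to convert it with the default ASCII codec, which fails on the non-ASCII character u'\u203a'.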
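A minimal sketch of the explicit encoding step. The helper name to_utf8_bytes and the sample string are just for illustration (they are not part of lxml or MySQLdb), and the MySQLdb call is left as a comment so the snippet runs without a database driver installed:

```python
# -*- coding: utf-8 -*-

def to_utf8_bytes(text):
    """Return UTF-8 encoded bytes for a text (unicode) value.

    Hypothetical helper: values that are already byte strings pass through.
    """
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


# Stand-in for i.text_content(); contains the character from the traceback.
scraped = u"\u203a Visit Amazon's Eve Pownall Page"

encoded = to_utf8_bytes(scraped)

# With the encoded byte string, the original call would become:
# mystring = MySQLdb.escape_string(encoded)
```

When you read the value back, decode it with the same encoding, e.g. value.decode("utf-8"). This assumes the database column and connection are also set up for UTF-8.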
More info:
http://collective-docs.readthedocs.org/en/latest/troubleshooting/unicode.html