I was trying to decode the following string and got an error.
item = lh.fromstring(items[1].text).text_content().strip().decode('utf-8')
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u20a8' in position 0: ordinal not in range(128)
Any idea what's wrong?
items[1].text = <strong>₨ 18,500 </strong>
repr(items[1].text) = u'\u20a8 18,500'
The fact that you've called decode but your error is citing encode is a clue that your string is Unicode to start with, not a bytestring. decode is for converting from bytestrings to Unicode; encode is for the other way round.
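To make that concrete, here is a minimal sketch, assuming the value is the Unicode string shown by repr() in the question. The variable names are only illustrative. When decode is called on a Unicode object, Python 2 first encodes it to bytes with the default ASCII codec, and that hidden step is what fails on u'\u20a8'. The snippet performs that step explicitly, then shows the two conversions in their intended directions.

```python
# -*- coding: utf-8 -*-
# Assumed input: the value repr() printed in the question (already Unicode).
text = u'\u20a8 18,500'

# Python 2 implicitly runs an ASCII encode before decoding a unicode object.
# Doing it explicitly reproduces the same UnicodeEncodeError.
try:
    text.encode('ascii')
    implicit_step_failed = False
except UnicodeEncodeError:
    implicit_step_failed = True

# The intended directions, made explicit:
as_bytes = text.encode('utf-8')        # Unicode -> bytestring
round_trip = as_bytes.decode('utf-8')  # bytestring -> Unicode
```

If text_content() is already returning Unicode, as the repr suggests, then dropping the trailing .decode('utf-8') from the original line should be enough.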