>>> a = {"a": "çö"}
>>> b = "çö"
>>> a['a']
'\xc3\xa7\xc3\xb6'
>>> b.decode('utf-8') == a['a']
False
What is going on here?
Edit: I’m sorry, it was my mistake. It is still False. I’m using Python 2.6 on Ubuntu 10.04.
Possible solutions
Either write like this:
Or like this (you may also skip the .decode('utf-8') on both sides):
Or like this (my recommendation):
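As a minimal sketch of the idea behind these options (not necessarily the answerer's exact code), the point is to make both operands the same type before comparing. The names stored, text_a, text_b and the result variables are illustrative assumptions, and the explicit b'' and u'' literals are used so that the snippet should behave the same on Python 2.6+ and Python 3.3+:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; all names here are hypothetical.
# Byte strings (b'...') play the role of Python 2 `str` values.
stored = {"a": b"\xc3\xa7\xc3\xb6"}   # UTF-8 bytes of 'çö', as in the question
b = b"\xc3\xa7\xc3\xb6"

# Decode both sides, so text is compared with text.
both_decoded = b.decode("utf-8") == stored["a"].decode("utf-8")

# Skip the decoding on both sides, so bytes are compared with bytes.
both_raw = b == stored["a"]

# Keep the values as unicode text from the start.
text_a = {"a": u"\xe7\xf6"}   # u'çö'
text_b = u"\xe7\xf6"
both_text = text_b == text_a["a"]

print(both_decoded)
print(both_raw)
print(both_text)
```

In all three variants the two operands have the same type, so no implicit conversion is needed during the comparison.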
Explanation
Updated based on Tim’s comment. In your original code, b.decode('utf-8') == u'çö' and a['a'] == 'çö', so you’re actually making the following comparison: u'çö' == 'çö'.
One of the objects is of type unicode, the other is of type str, so in order to execute the comparison, the str is converted to unicode and then the two unicode objects are compared. This works fine for purely ASCII strings, for example u'a' == 'a', since unicode('a') == u'a'.
However, it fails in the case of u'çö' == 'çö', since unicode('çö') raises the following error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128). The whole comparison therefore returns False and issues the following warning: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal.
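The failing step can be reproduced explicitly. The sketch below assumes that the implicit conversion behaves like an explicit ASCII decode, which is the default in Python 2. The names raw, text, ascii_ok, ascii_failed and utf8_ok are illustrative, and the literals are chosen so that the snippet should also run on Python 3:

```python
# -*- coding: utf-8 -*-
# Illustrative sketch; names are hypothetical.
raw = b"\xc3\xa7\xc3\xb6"   # UTF-8 bytes of 'çö' (a Python 2 `str`)
text = u"\xe7\xf6"          # the text 'çö' (a Python 2 `unicode`)

# ASCII-only bytes decode cleanly, which is why u'a' == 'a' works in Python 2.
ascii_ok = b"a".decode("ascii") == u"a"

# Non-ASCII bytes cannot be decoded as ASCII: this is the conversion that
# fails behind the scenes in the mixed comparison.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError as exc:
    ascii_failed = True
    message = str(exc)

# Decoding with the encoding the bytes were actually written in succeeds.
utf8_ok = raw.decode("utf-8") == text

print(ascii_ok)
print(ascii_failed)
print(utf8_ok)
```

Decoding explicitly with the correct codec avoids relying on the ASCII default, which is the reason the comparison in the question needs both sides to be unicode.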