I connect to a MySQL database using pymysql, and after executing a query I got the following string: \xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0.
This should be 5 characters in UTF-8, but when I do print s.encode('utf-8') I get this: ╨╝╨░╤А╨║╨░. The string looks like the byte representation of Unicode characters, which Python fails to recognize.
So what do I do to make python process them properly?
You want to decode (not encode) to get a unicode string from a byte string.
Note that you may not be able to print it because it contains characters outside ASCII. But you should be able to see its value in a Unicode-aware debugger. I ran the above in IDLE.
Update
It seems what you actually have is this:
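As a rough illustration of the two situations, here is a minimal sketch. It assumes Python 3 syntax (the question uses Python 2, where the text literal would be written with a `u` prefix), and the variable names `raw`, `text`, and `mangled` are made up for this example. The first case is a genuine byte string, where a plain decode is enough; the second is a text string whose characters merely carry the byte values, which appears to be what the query returned.

```python
# Case 1: a genuine byte string holding UTF-8 data.
# Decoding it yields the intended 5-character text.
raw = b"\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0"
text = raw.decode("utf-8")

# Case 2: a *text* string of 10 characters whose code points
# happen to equal the UTF-8 byte values (U+00D0, U+00BC, ...).
# It is already text, so there is nothing to decode directly.
mangled = "\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0"

# ascii() avoids console encoding problems when inspecting the values.
print(len(raw), len(text), len(mangled))
print(ascii(text))
print(ascii(mangled))
```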
This is trickier because you first have to get those bytes into a byte string before you call decode. I’m not sure what the “best” way to do that is, but this works:
Note that you should of course be decoding it before you store it in the database as a string.
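One possible way to get those characters back into a byte string is to encode with Latin-1, which maps code points U+0000 through U+00FF one-to-one onto bytes 0x00 through 0xFF, and then decode the result as UTF-8. This is a sketch under the assumption of Python 3 and of a string that contains only such code points. The helper name `fix_mojibake` is hypothetical, and this is not necessarily the approach originally used here.

```python
def fix_mojibake(s):
    """Reinterpret a str whose code points are really UTF-8 byte values.

    Hypothetical helper: raises UnicodeEncodeError if s contains a
    character above U+00FF, and UnicodeDecodeError if the resulting
    bytes are not valid UTF-8.
    """
    # Step 1: turn each character back into the byte with the same value.
    as_bytes = s.encode("latin-1")
    # Step 2: decode those bytes as the UTF-8 they were meant to be.
    return as_bytes.decode("utf-8")


mangled = "\xd0\xbc\xd0\xb0\xd1\x80\xd0\xba\xd0\xb0"
fixed = fix_mojibake(mangled)
print(len(mangled), len(fixed))
print(ascii(fixed))
```

A repair like this only treats the symptom. If values come back from the database in this form, it is worth checking whether the connection and the column agree on the character set.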