I’m running a Python program on a user’s computer in Portugal, where the user’s username contains Unicode characters. I would like os.path.expanduser('~') to return something functional, since I use the resulting path for some file operations, but it currently returns a Python str representation of a Unicode string:
>>> import os
>>> os.path.expanduser('~')
'C:\\Users\\V\xe2nia'
But this is a Python string … how can I convert this to an actual Unicode string that Windows will recognize as a valid file path?
The function returned a byte string, not a unicode string. You need to decode it, given the encoding used for the string.
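A minimal sketch of that decoding step, assuming Python 2 on Windows and assuming the bytes are in the filesystem encoding. `home_as_text` is a hypothetical helper name, and the `isinstance` check is only there so the same code also runs unchanged on Python 3, where `os.path.expanduser('~')` already returns text:

```python
import os
import sys

def home_as_text():
    """Return the home directory as a text (unicode) string."""
    home = os.path.expanduser('~')
    if isinstance(home, bytes):
        # Python 2: str is a byte string, so decode it explicitly.
        # Assumption: the bytes are in the filesystem encoding.
        home = home.decode(sys.getfilesystemencoding())
    return home

# The byte string from the question, decoded as latin-1 (a guess,
# not a confirmed encoding):
sample = b'C:\\Users\\V\xe2nia'.decode('latin-1')
```

If the guess about the encoding is wrong, the decode either fails or yields the wrong characters, so it is worth checking the result against the real directory name.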
I’m presuming here that the encoding used was the filesystem encoding, which is available via sys.getfilesystemencoding(). It looks like latin-1 from here, but you can’t be certain.
You can also try passing a unicode path to os.path.expanduser() and have Python do the decoding for you.
Please read up on this and other Unicode issues in the Python Unicode HOWTO. If you don’t understand the difference between an encoded bytestring and a Unicode string, please do read this excellent article as well.
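The alternative mentioned above, passing a unicode path to `os.path.expanduser()`, might look like the sketch below. Whether the implicit decoding succeeds on Python 2 depends on the platform and the default encoding, so the `UnicodeDecodeError` fallback and the helper name `home_dir` are assumptions added for illustration, not part of the original suggestion:

```python
import os
import sys

def home_dir():
    """Return the home directory, asking for a unicode result first."""
    try:
        # A unicode argument asks Python for a unicode result.
        return os.path.expanduser(u'~')
    except UnicodeDecodeError:
        # Assumed fallback: take the byte-string result and decode it
        # explicitly with the filesystem encoding.
        return os.path.expanduser('~').decode(sys.getfilesystemencoding())
```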