I’m receiving some data from a ZODB (Zope Object Database). I receive a mybrains object. Then I do:
o = mybrains.getObject()
and I receive a “Person” object in my project. Then, I can do
b = o.name
and doing print b in my class, I get:
José Carlos
and print b.__class__ gives:
<type 'unicode'>
I have a lot of “Person” objects. They are added to a list.
names = [o.nome, o1.nome, o2.nome]
Then I try to create a text file with this data.
delimiter = ';'
all = delimiter.join(names) + '\n'
No problem. Now, when I do print all, I get:
José Carlos;Jonas;Natália
Juan;John
But when I try to create a file of it:
f = open("/tmp/test.txt", "w")
f.write(all)
I get an error like this (the positions aren’t exactly the same, since I changed the names):
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 84: ordinal not in range(128)
If I can already print it in the “correct” form, why can’t I write it to a file? Which encode/decode method should I use to write a file with this data?
I’m using Python 2.4.5 (I can’t upgrade it).
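write is trying to encode the string using the ascii codec, which has no way of encoding accented characters like é or à. Instead, open the file with an explicit encoding such as ‘utf-8’ (for example with codecs.open) and write all.decode('utf-8') to it,
or choose some other codec (like cp1252) which can encode the characters in your string.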
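A minimal sketch of writing through an explicit codec, assuming the file is opened with codecs.open from the standard library. The sample text and the temporary path are placeholders, not the asker's real data, and the text is assumed to already be a unicode string (so no decode call is needed here).

```python
import codecs
import os
import tempfile

# Hypothetical sample data standing in for the joined names.
text = u'Jos\xe9 Carlos;Jonas;Nat\xe1lia\n'

# A temporary file is used here instead of /tmp/test.txt.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# codecs.open returns a file object that encodes unicode on write,
# so the ascii default is never involved.
f = codecs.open(path, 'w', encoding='utf-8')
try:
    f.write(text)
finally:
    f.close()
```

try/finally is used instead of a with statement because with is not available in Python 2.4.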
PS. all.decode('utf-8') was used above because f.write expects a unicode string. Better than using all.decode('utf-8') would be to convert all your strings to unicode early, work in unicode, and encode to a specific encoding like ‘utf-8’ late, only when you have to.
PPS. It looks like names might already be a list of unicode strings. In that case, define delimiter to be a unicode string too: delimiter = u';', so all will be a unicode string. Then writing all to a file opened with an explicit encoding should work (unless there is some issue with Python 2.4 that I’m not aware of).
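The "unicode early, encode late" idea can also be sketched with an explicit encode call and a file opened in binary mode. This is a different technique from a codec-aware file object. It assumes names is already a list of unicode strings; the values and the temporary path below are placeholders.

```python
import os
import tempfile

# Assumed: the names are already unicode strings (placeholder values).
names = [u'Jos\xe9 Carlos', u'Jonas', u'Nat\xe1lia']

# A unicode delimiter keeps the joined result unicode as well.
delimiter = u';'
line = delimiter.join(names) + u'\n'

# Encode once, at the boundary where bytes are actually needed.
data = line.encode('utf-8')

fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# Binary mode: the bytes are written exactly as encoded.
f = open(path, 'wb')
try:
    f.write(data)
finally:
    f.close()
```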
If ‘utf-8’ does not work, remember to try other encodings that contain the characters you need, and that your computer knows about. On Windows, that might mean ‘cp1252’.
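One way to try several encodings in turn is sketched below. encode_with_first_working is a hypothetical helper written for this illustration, not a library function, and the candidate list is only an example.

```python
def encode_with_first_working(text, encodings):
    """Return (encoding_name, encoded_bytes) for the first codec that works.

    Hypothetical helper: tries each candidate in order and skips any
    codec that cannot represent every character in text.
    """
    for name in encodings:
        try:
            return name, text.encode(name)
        except UnicodeEncodeError:
            # This codec lacks at least one character; try the next one.
            continue
    raise ValueError('none of the given encodings can encode the text')


# ascii fails on the accented character, so cp1252 is picked here.
chosen, data = encode_with_first_working(u'Jos\xe9', ['ascii', 'cp1252'])
```

Whatever encoding is chosen, the program that later reads the file has to use the same one.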