I tried to backport a Python 3 program to 2.7, and I’m stuck with a strange problem:
>>> import io
>>> import csv
>>> output = io.StringIO()
>>> output.write("Hello!") # Fail: io.StringIO expects Unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
>>> output.write(u"Hello!") # This works as expected.
6L
>>> writer = csv.writer(output) # Now let's try this with the csv module:
>>> csvdata = [u"Hello", u"Goodbye"] # Look ma, all Unicode! (?)
>>> writer.writerow(csvdata) # Sadly, no.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
According to the docs, io.StringIO() returns an in-memory stream for Unicode text. It works correctly when I feed it a Unicode string manually. Why does it fail in conjunction with the csv module, even though all the strings being written are Unicode strings? Where does the str that causes the exception come from?
(I do know that I can use StringIO.StringIO() instead, but I’m wondering what’s wrong with io.StringIO() in this scenario)
The Python 2.7 csv module doesn’t support Unicode input: see the note at the beginning of the documentation. Internally, the writer converts every field to a byte string (str) and passes that str to the stream’s write() method, which is where the str that io.StringIO rejects comes from.
You’ll have to encode the Unicode strings to byte strings and use io.BytesIO instead of io.StringIO.
The examples section of the documentation includes UnicodeReader and UnicodeWriter wrapper classes (thanks @AlexeyKachayev for the pointer).
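For a program that has to run on both Python 3 and 2.7, the encode-then-BytesIO approach might look roughly like the sketch below. The helper name rows_to_csv_text and the choice of UTF-8 are my own assumptions, not something from the csv documentation, and the Python 2 branch assumes every cell is already a unicode string.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def rows_to_csv_text(rows, encoding="utf-8"):
    """Hypothetical helper: serialize rows of Unicode strings to CSV text.

    Returns a Unicode string on both Python 2.7 and Python 3.
    """
    if PY2:
        # The Python 2 csv module writes byte strings, so give it a byte
        # stream and encode every cell before handing it to the writer.
        buf = io.BytesIO()
        writer = csv.writer(buf)
        for row in rows:
            writer.writerow([cell.encode(encoding) for cell in row])
        # Decode once at the end to get Unicode text back.
        return buf.getvalue().decode(encoding)

    # The Python 3 csv module writes text, so io.StringIO works directly.
    buf = io.StringIO(newline="")
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(repr(rows_to_csv_text([[u"Hello", u"Goodbye"]])))
```

If you only need Python 2.7, the first branch on its own is enough; the UnicodeWriter recipe in the documentation covers the same idea more thoroughly.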