Does anyone have any experience with this?
I have been using python 3.2 for the last half a year, and my memory of 2.6.2 is not that great.
On my computer the following code works, tested using 2.6.1:
import contextlib
import codecs

def readfile(path):
    with contextlib.closing( codecs.open( path, 'r', 'utf-8' )) as f:
        for line in f:
            yield line

path = '/path/to/norsk/verbs.txt'
for i in readfile(path):
    print i
but on the phone it gets to the first special character ø and throws:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf8' in position 3: ordinal not in range(128)
Any ideas? I am going to need to input these characters as well as read them from a file.
Printing is an I/O operation, and I/O requires bytes. What you have in i is unicode, that is, characters. Characters only convert directly to bytes when we're talking about ASCII, but on your phone you have encountered a non-ASCII character (u'\xf8' is ø). To convert characters to bytes, you need to encode them.
As to why your code works on one machine and not the other, I bet python's autodetection has found different things in those cases. Run this on each device:
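A minimal sketch of such a check, assuming the two values of interest are the encoding attached to stdout and the interpreter's default encoding. The helper name report_encodings is hypothetical, and the code is written so that it runs on both Python 2 and Python 3:

```python
import sys

def report_encodings():
    """Return the encodings Python picked for this environment."""
    return {
        # Encoding attached to stdout; may be None when output is piped
        # or when no terminal encoding could be detected.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used for implicit conversions between bytes and text.
        "default": sys.getdefaultencoding(),
    }

info = report_encodings()
print("stdout encoding: %s" % info["stdout"])
print("default encoding: %s" % info["default"])
```

If the two devices report different values here, that difference is the likely reason the same script behaves differently on each.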
I expect you’ll see utf8 on one and ascii on the other. This is what print uses when the destination is a terminal. If you’re sure that all users of your python installation (very possibly just you) prefer utf8 over ascii, you can change the default encoding of your python installation.
python -c 'import site; print site'
Open the file it reports and find the setencoding function:
Change the encoding = "ascii" line to encoding = "UTF-8". Enjoy as things Just Work. You can find more information on this topic here: http://blog.ianbicking.org/illusive-setdefaultencoding.html
If you'd instead like a strict separation of bytes vs characters such as python3 provides, you can set encoding = "undefined". The undefined codec will "Raise an exception for all conversions. Can be used as the system encoding if no automatic coercion between byte and Unicode strings is desired."
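With a strict separation, every conversion has to be spelled out at the I/O boundary. The following is a rough sketch of what that looks like, assuming UTF-8 is the encoding you want on both input and output. The helper names to_bytes and to_text are hypothetical, and the code runs on both Python 2 and Python 3:

```python
def to_bytes(text, encoding="utf-8"):
    """Encode characters to bytes explicitly (the output boundary)."""
    return text.encode(encoding)

def to_text(raw, encoding="utf-8"):
    """Decode bytes to characters explicitly (the input boundary)."""
    return raw.decode(encoding)

word = u"\xf8l"                  # contains the character from the traceback
encoded = to_bytes(word)         # UTF-8 bytes, safe to hand to any I/O call
round_trip = to_text(encoded)    # decoding gives the original characters back

# Asking for ASCII fails in the same way the print statement did.
try:
    to_bytes(word, "ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```

In the loop from the question, this would mean printing i.encode('utf-8') rather than i, and decoding any typed input from bytes to unicode before working with it.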