If I read a unicode file using the following: f = open(r’file.txt’,’rU’) raw =


Asked: June 17, 2026

If I read a unicode file using the following:

f = open(r'file.txt','rU')
raw = f.read()

How can I cause the file to be read as extended ASCII, that is, convert \xc3\xaa to ê correctly and convert all non-displayable characters to a default character (say, ?)?

I also have the following:

# Create a file called sitecustomize.py in c:\python27\Lib\site-packages.
import sys
sys.setdefaultencoding('iso-8859-1')

which I’m not sure whether I need to change.

For some reason I can’t paste ê into the Python console (DOS prompt on Windows), but I can do:

>>> s = u'La Pe\xf1a'
>>> print s
La Peña

Anybody have any idea how to do this?

1 Answer

Editorial Team · Answer 1 · 2026-06-17T08:10:24+00:00

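In Python 2, use codecs.open, which decodes the bytes while reading:

import codecs
f = codecs.open('file.txt', 'r', encoding='utf8')

In Python 3, the built-in open takes an encoding directly (the 'U' flag is not needed and was removed in Python 3.11):

f = open('file.txt', 'r', encoding='utf8')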

To clear up the confusion: there is no such thing as a “unicode file”. Unicode is an abstract mapping from characters to code points, and files are bytes on your disk. To convert those bytes into its in-memory representation of Unicode code points, Python needs to know how to interpret them. That interpretation is called an “encoding”, and the bytes in your post (\xc3\xaa for ê) are UTF-8, so that is what you have to tell Python.
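As a rough illustration of the point above and of the "replace with ?" part of the question, here is a minimal sketch. It assumes the file really is UTF-8 and that "non-displayable" means "not representable in ISO-8859-1" (control characters would pass through unchanged). The helper name read_as_latin1 and the temporary demo file are made up for this example.

```python
import io
import os
import tempfile

# The same two bytes give different text depending on the encoding you name.
raw_bytes = b"\xc3\xaa"
as_utf8 = raw_bytes.decode("utf-8")          # one code point: U+00EA (e with circumflex)
as_latin1 = raw_bytes.decode("iso-8859-1")   # two code points: U+00C3, U+00AA


def read_as_latin1(path, source_encoding="utf-8"):
    """Hypothetical helper: decode a file, then swap anything outside
    ISO-8859-1 for '?'."""
    # io.open behaves the same on Python 2.7 and Python 3.
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    with io.open(path, "r", encoding=source_encoding, errors="replace") as f:
        text = f.read()
    # Round-trip through ISO-8859-1: characters it cannot encode
    # (including U+FFFD) come back as '?'.
    return text.encode("iso-8859-1", "replace").decode("iso-8859-1")


# Demo on a throwaway file holding UTF-8 bytes for "Pe<n-tilde>a", "e-circumflex"
# and a euro sign (the euro sign does not exist in ISO-8859-1).
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(b"Pe\xc3\xb1a \xc3\xaa \xe2\x82\xac")
try:
    result = read_as_latin1(demo_path)
finally:
    os.remove(demo_path)

print(result == u"Pe\xf1a \xea ?")  # prints True
```

The decode step is where the encoding has to be named; the encode step with "replace" is only there to force everything into the Latin-1 range. If you want plain ASCII instead, passing "ascii" in that second step would replace the accented letters with ? as well.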
