I'm using Python to pick up some pieces, so I'm definitely a noob here, but I didn't see a satisfactory answer.
I have a UTF-8 JSON file with some pieces that have graves, acutes, etc. I'm using codecs and have (for example):
import codecs
import json

f = codecs.open('../../publish_scripts/locations.json', 'r', 'utf-8')
locations = json.load(f)
for location in locations:
    print location['name']
For printing, does anything special need to be done? It's giving me the following:
'ascii' codec can't encode character u'\xe9' in position 5
It looks like the correct utf-8 value for e-acute. I suspect I'm doing something wrong with printing. Would the iteration cause it to lose its utf-8-ness?
The PHP and Ruby versions handle the utf-8 piece fine; is there some looseness in those languages that Python won't allow?
thx
codecs.open() will decode the contents of the file using the codec you supplied (utf-8). What you read from it is then a Python unicode object (which behaves similarly to a string object).
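As a rough illustration of that decoding step, here is a minimal sketch. The sample bytes and the temporary file are made up for the example (they are not the asker's locations.json), and the code is written so that it should run on both Python 2 and Python 3.

```python
from __future__ import print_function
import codecs
import os
import tempfile

# Hypothetical sample data: "cafe" with an acute e, as UTF-8 bytes
# (the accented letter is the two bytes C3 A9).
raw_bytes = b'caf\xc3\xa9'

# Write the raw bytes to a temporary file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as handle:
    handle.write(raw_bytes)

# codecs.open() returns a wrapper that decodes the bytes as they are read.
with codecs.open(path, 'r', 'utf-8') as reader:
    text = reader.read()

os.remove(path)

# text is decoded text (unicode in Python 2, str in Python 3):
# the two UTF-8 bytes have become the single code point U+00E9.
print(repr(text))
print(len(raw_bytes), len(text))
```

The length difference (5 bytes on disk, 4 characters after decoding) is the easiest way to see that the data is no longer "UTF-8" once it has been read; it is abstract text that has to be encoded again on the way out.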
Printing a unicode object will cause an implicit (behind-the-scenes) encode using the default codec, which is usually ascii. If ascii cannot encode all of the characters present, it will fail.
To print it, you should first encode it explicitly.
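A minimal sketch of the difference between the implicit ASCII encode and an explicit UTF-8 encode. The name value is a hypothetical stand-in for location['name'], and the code is written to run on both Python 2 and Python 3.

```python
from __future__ import print_function

# Hypothetical stand-in for location['name']: "Montreal" with an acute e.
name = u'Montr\xe9al'

# Roughly what the implicit encode does when the default codec is ascii,
# and why it fails on the accented character.
try:
    name.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly to UTF-8 works for any valid text.
encoded = name.encode('utf-8')

print(ascii_ok)
print(repr(encoded))
```

In Python 2, printing the encoded value (print location['name'].encode('utf-8')) sends those bytes to the terminal unchanged, so the output looks right as long as the terminal itself expects UTF-8.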
EDIT:
For your info, json.load() actually takes a file-like object (which is what codecs.open() returns). What you have at that point is neither a string nor a unicode object, but an iterable wrapper around the file.
By default, json.load() expects the file to be utf-8 encoded, so the codecs.open() call is not needed and your code snippet can be simplified.
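One way the simplified version could look, as a sketch only. The JSON content and the temporary file are hypothetical stand-ins for locations.json, and the code is written to run on both Python 2 and Python 3 (3.6 or later for reading JSON from a binary file).

```python
from __future__ import print_function
import json
import os
import tempfile

# Hypothetical stand-in for locations.json: a UTF-8 encoded list of objects.
raw_json = b'[{"name": "Montr\xc3\xa9al"}, {"name": "Gen\xc3\xa8ve"}]'

fd, path = tempfile.mkstemp(suffix='.json')
with os.fdopen(fd, 'wb') as handle:
    handle.write(raw_json)

# No codecs.open() here: hand json.load() a plain file object opened in
# binary mode and let it do the UTF-8 decoding itself.
with open(path, 'rb') as json_file:
    locations = json.load(json_file)

os.remove(path)

names = [location['name'] for location in locations]
for name in names:
    # Encode explicitly so output never falls back to the ascii codec.
    print(repr(name.encode('utf-8')))
```

The values in names are decoded text either way, so the explicit encode before printing is still needed on Python 2.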