I have a 1 GB JSON file and I want to parse it in Python using simplejson. I wrote the simple code below, which works fine:
import simplejson
f=open('stem.json','r')
content=f.read()
data=simplejson.loads(content)
The problem with the above code is that it does not read the data as UTF-8.
So I rewrote the code as follows:
import simplejson
import codecs
f=codecs.open('stem.json','r',encoding='utf-8')
content=f.read()
data=simplejson.loads(content)
The problem with this version is that it won't run to completion: the kernel is “Killing” the program.
This seems strange to me, because the code works without the encoding, but when I read the file with an encoding it uses a lot of memory.
Can anyone tell me what's happening here?
You could try to open the file normally and use simplejson.load() with an encoding parameter instead of reading the whole file into memory first.

As I said in the comment above, I think the real solution is to use a different persistence backend, other than serialising to JSON.
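
The simplejson.load() suggestion could be sketched roughly as below. This is a minimal sketch, not a definitive fix: the helper name load_utf8_json is hypothetical, the file is opened in binary mode so the decoder (rather than codecs) does the UTF-8 decoding, and the fallback to the standard-library json module is an assumption added only so the sketch runs where simplejson is not installed.

```python
try:
    import simplejson

    def _load(fp):
        # simplejson decodes the raw bytes itself, using the given encoding.
        return simplejson.load(fp, encoding='utf-8')
except ImportError:
    # Assumption: fall back to the stdlib json module, which accepts
    # UTF-8 bytes directly and has no encoding parameter in recent Pythons.
    import json

    def _load(fp):
        return json.load(fp)


def load_utf8_json(path):
    """Parse a UTF-8 encoded JSON file (hypothetical helper)."""
    # Open the file normally (binary mode, no codecs wrapper) and hand the
    # file object to load(), so no separate `content` variable is kept
    # alive next to the parsed data.
    with open(path, 'rb') as fp:
        return _load(fp)
```

With the file from the question, this would be called as data = load_utf8_json('stem.json'). Whether it fits in memory for a 1 GB file still depends on the machine, which is why a different storage backend may be the better long-term option.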