How can I save a dictionary whose keys contain UTF-8 characters to a file in Python with cPickle? The dictionary is very large, and I’ve heard that cPickle is much faster than pickle. I also suspect that having UTF-8 encoded keys is problematic.
Any other fast solutions are also welcome.
Here is what I do, followed by the error message:
unique_ngrams_dict = defaultdict(lambda: 0)# just to show how I defined my dict
dict_file = codecs.open('ngram_dict', 'w', 'utf-8')
cPickle.dump(unique_ngrams_dict,dict_file)
dict_file.close()
error message:
Traceback (most recent call last):
File "Generate_NGram.py", line 81, in <module>
save_ngram_dict(unique_ngrams_dict)
File "Generate_NGram.py", line 70, in save_ngram_dict
cPickle.dump(unique_ngrams_dict,dict_file)
File "/usr/lib/python2.6/copy_reg.py", line 70, in _reduce_ex
raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle function objects
thanks
Pickle is a binary format, so you shouldn’t open the file with any codecs; just use the built-in open() in binary mode ('wb').
That isn’t the reason it’s failing; going through codecs is just quite inefficient.
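As a minimal sketch of the binary-mode approach described above: the sample keys, the temporary path, and the use of pickle.HIGHEST_PROTOCOL are assumptions chosen for illustration, not part of the original question. The try/except import is only there so the same snippet runs where cPickle is unavailable.

```python
# -*- coding: utf-8 -*-
import os
import tempfile

try:
    import cPickle as pickle  # Python 2: the fast C implementation
except ImportError:
    import pickle  # Python 3: pickle already uses the C accelerator

# Hypothetical sample data: a plain dict with non-ASCII unicode keys.
ngrams = {u"caf\u00e9": 3, u"\u0633\u0644\u0627\u0645": 1}

# Hypothetical location; the question writes to 'ngram_dict' in the cwd.
path = os.path.join(tempfile.mkdtemp(), "ngram_dict")

# Open in binary mode, without codecs: pickle produces bytes, not text.
with open(path, "wb") as dict_file:
    pickle.dump(ngrams, dict_file, pickle.HIGHEST_PROTOCOL)

# Read it back the same way, again in binary mode.
with open(path, "rb") as dict_file:
    restored = pickle.load(dict_file)
```

Unicode keys need no special handling here; pickle stores them itself, so no encoding step is required on the file object.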
The actual problem is that the object you are trying to save contains a function reference (the default factory lambda: 0), and pickle cannot serialize it: functions are pickled by name, and a lambda has no importable name. You have three options:
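1. Use a plain dict and use its .get method with a default argument.
2. Set the dictionary's default_factory attribute to None before pickling and set it back to lambda: 0 after unpickling.
3. Define a module-level class NgramDefault whose __call__ method returns 0, and use NgramDefault() as the default factory instead of lambda: 0.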
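The three options can be sketched roughly as follows. The body of NgramDefault is an assumption reconstructed from how the answer uses it (a callable returning 0), and the sample key and variable names other than unique_ngrams_dict are made up for illustration.

```python
from collections import defaultdict

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle  # Python 3


class NgramDefault(object):
    """Assumed shape of the class: a picklable callable that returns 0."""

    def __call__(self):
        return 0


key = u"caf\u00e9"  # hypothetical n-gram

# Option 1: a plain dict, using .get with a default instead of a factory.
counts = {}
counts[key] = counts.get(key, 0) + 1
plain_copy = pickle.loads(pickle.dumps(counts, pickle.HIGHEST_PROTOCOL))

# Option 2: drop the lambda before pickling, restore it afterwards.
unique_ngrams_dict = defaultdict(lambda: 0)
unique_ngrams_dict[key] += 1
unique_ngrams_dict.default_factory = None       # nothing unpicklable left
payload = pickle.dumps(unique_ngrams_dict, pickle.HIGHEST_PROTOCOL)
unique_ngrams_dict.default_factory = lambda: 0  # keep using the original
restored = pickle.loads(payload)
restored.default_factory = lambda: 0            # re-attach after loading

# Option 3: a class instance as the factory; it pickles by class name.
class_based = defaultdict(NgramDefault())
class_based[key] += 1
class_copy = pickle.loads(pickle.dumps(class_based, pickle.HIGHEST_PROTOCOL))
```

With option 3 the class has to be importable under the same name when the file is loaded, so it should live at module level rather than inside a function. If the default is always 0, the built-in int also works as a picklable factory, i.e. defaultdict(int).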