I have been using Python’s pickle
module for implementing a thin file-based persistence layer. The
persistence layer (part of a larger library) relies heavily on pickle’s persistent_id feature
to save objects of specified classes as separate files.
The only issue with this approach is that pickle files are not human
editable, and I’d much rather have objects saved in a format that is
human readable and editable with a text editor (e.g., YAML or JSON).
Do you know of any library that uses a human-editable format and
offers features similar to pickle’s persistent_id? Alternatively,
do you have suggestions for implementing them on top of a YAML- or
JSON-based serialization library, without rewriting a large subset of
pickle?
I haven’t tried this yet myself, but I think you should be able to do this elegantly with PyYAML using what they call “representers” and “resolvers”.
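For anyone unfamiliar with the mechanism, here is a minimal sketch of a representer paired with a constructor (the loading-side counterpart). The Point class, the !point tag, and the PointDumper/PointLoader subclasses are made up for illustration; only the PyYAML calls themselves are real. Resolvers are not covered here.

```python
import yaml


class Point:
    """Hypothetical example class, not part of the original question."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


# Subclass the safe dumper/loader so the global PyYAML classes stay untouched.
class PointDumper(yaml.SafeDumper):
    pass


class PointLoader(yaml.SafeLoader):
    pass


def point_representer(dumper, point):
    # Emit the object as a mapping that carries a custom tag.
    return dumper.represent_mapping("!point", {"x": point.x, "y": point.y})


def point_constructor(loader, node):
    # Turn the tagged mapping back into a Point.
    fields = loader.construct_mapping(node)
    return Point(fields["x"], fields["y"])


PointDumper.add_representer(Point, point_representer)
PointLoader.add_constructor("!point", point_constructor)

text = yaml.dump(Point(1, 2), Dumper=PointDumper)
restored = yaml.load(text, Loader=PointLoader)
```

The dumped text stays human-editable: the tag marks which class to rebuild, and the fields are plain YAML underneath it.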
EDIT
After an extensive exchange of comments with the poster, here is a method to achieve the required behavior with PyYAML.
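What follows is a rough sketch of how such a setup could look, not necessarily the exact code that was worked out in the comments. It assumes each Persistable carries a persistent_id attribute that doubles as its file name. The InlineDumper, RefDumper and PersistLoader classes, the persistable_ref_representer and xref_constructor helpers, and the !persistable and !xref tags are all names invented for this illustration.

```python
import os
import tempfile

import yaml


class Persistable:
    """Instances are meant to be saved to their own files."""

    def __init__(self, persistent_id, **attrs):
        self.persistent_id = persistent_id  # assumption: used as the file name
        self.__dict__.update(attrs)


class InlineDumper(yaml.SafeDumper):
    """Hypothetical dumper: writes a Persistable in full, as a tagged mapping."""


class RefDumper(yaml.SafeDumper):
    """Hypothetical dumper: writes a Persistable as a reference to its file."""


class PersistLoader(yaml.SafeLoader):
    """Hypothetical loader: understands both the inline form and references."""


def persistable_representer(dumper, obj):
    # Inline form: every attribute of the object under a custom tag.
    return dumper.represent_mapping("!persistable", vars(obj))


def persistable_ref_representer(dumper, obj):
    # Side effect: write the object to its own file, then emit only a reference.
    with open(obj.persistent_id, "w") as f:
        yaml.dump(obj, f, Dumper=InlineDumper)
    return dumper.represent_scalar("!xref", obj.persistent_id)


def persistable_constructor(loader, node):
    # deep=True so that nested (inline) Persistable objects are built first.
    attrs = loader.construct_mapping(node, deep=True)
    return Persistable(**attrs)


def xref_constructor(loader, node):
    # Follow the reference and load the object from its own file.
    with open(loader.construct_scalar(node)) as f:
        return yaml.load(f, Loader=PersistLoader)


InlineDumper.add_representer(Persistable, persistable_representer)
RefDumper.add_representer(Persistable, persistable_ref_representer)
PersistLoader.add_constructor("!persistable", persistable_constructor)
PersistLoader.add_constructor("!xref", xref_constructor)


def my_yaml_dump(data, stream=None, **kwargs):
    # Same call shape as yaml.dump, but Persistable objects go to separate files.
    return yaml.dump(data, stream, Dumper=RefDumper, **kwargs)


# Usage: the outer object gets its own file; the inner one is stored inline in it.
workdir = tempfile.mkdtemp()
inner_path = os.path.join(workdir, "inner.yaml")
outer_path = os.path.join(workdir, "outer.yaml")
outer = Persistable(outer_path, child=Persistable(inner_path, value=1))

text = my_yaml_dump({"root": outer})
restored = yaml.load(text, Loader=PersistLoader)
```

In this sketch the constructors are registered on a dedicated loader class, so loading still goes through plain yaml.load, just with Loader=PersistLoader passed in. Recent PyYAML releases require an explicit Loader argument anyway.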
Important Note: If a Persistable instance has another such instance as an attribute, or contained somehow inside one of its attributes, then the contained Persistable instance will not be saved to yet another separate file, rather it will be saved inline in the same file as the parent Persistable instance. To the best of my understanding, this limitation also existed in the OP’s pickle-based system, and may be acceptable for his/her use cases. I haven’t found an elegant solution for this which doesn’t involve hacking yaml.representer.BaseRepresenter.
From now on use my_yaml_dump instead of yaml.dump when you want to save instances of the Persistable class to separate files. But don’t use it inside persistable_representer and persistable_constructor! No special loading function is necessary, just use yaml.load.
Phew, that took some work… I hope this helps!