I have a fairly large, already working project written by a scientist rather than a programmer. The program holds a lot of data in a huge object tree. Almost all of the classes involved are mutable, and worse, they tend to change the state of other objects in almost every method, even getters and setters. This object tree is then stored to disk via Java serialization.
My task is to migrate from serialization to a database in order to reduce memory consumption. Manually refactoring these tons of mutable objects, which change each other at unknown points in time, is complete hell, especially for me, since I don’t know the subject domain.
Are there any approaches, practices, or refactoring patterns for such cases?
I think you have answered your question in the subject line. Using JPA or Hibernate with annotations, you can take the existing object model and treat each class as its own table. Don’t worry about constraints to start with.
In every class, add an id property so that it can be stored, and implement equals and hashCode (so you’ll have to figure out what makes one object equal to another). This will help avoid creating duplicates in your database.
You will need to write code to persist your existing records. Perhaps the Visitor pattern would be a good fit here: to each domain object you add a method that takes a visitor and invokes it to persist the object.
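To make the two steps above concrete, here is a minimal sketch in plain Java. The names Sample, NodeVisitor and IdAssigningVisitor are made up for illustration, and the assumption that a name field identifies an object is mine; your real classes will need their own notion of equality. The visitor only assigns ids and collects nodes in a list. In a real migration its visit method would presumably call something like EntityManager.persist, and the class would carry the JPA annotations mentioned in the comments.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical visitor interface. visit() returns true the first time it
// sees a node, false if the node was already handled.
interface NodeVisitor {
    boolean visit(Sample node);
}

// Hypothetical domain class. With JPA it would be annotated with @Entity,
// and the id field with @Id (plus, most likely, @GeneratedValue).
class Sample {
    private Long id;                  // surrogate key, null until stored
    private final String name;        // assumed business key (an assumption)
    private final List<Sample> children = new ArrayList<>();

    Sample(String name) { this.name = name; }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    void addChild(Sample child) { children.add(child); }

    // Let the visitor handle this node, then walk the children.
    // Stops at nodes the visitor has already seen, so cycles terminate.
    void accept(NodeVisitor visitor) {
        if (visitor.visit(this)) {
            for (Sample child : children) {
                child.accept(visitor);
            }
        }
    }

    // Equality is based on the business key, not on the generated id,
    // so it behaves the same before and after the object is stored.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Sample)) return false;
        return Objects.equals(name, ((Sample) o).name);
    }

    @Override
    public int hashCode() { return Objects.hash(name); }
}

// Stand-in for a persisting visitor: assigns ids and records each node once.
class IdAssigningVisitor implements NodeVisitor {
    // Identity-based set: tracks object instances, independent of equals().
    private final Set<Sample> seen =
            Collections.newSetFromMap(new IdentityHashMap<Sample, Boolean>());
    final List<Sample> persisted = new ArrayList<>();
    private long nextId = 1;

    @Override
    public boolean visit(Sample node) {
        if (!seen.add(node)) return false;  // already handled
        node.setId(nextId++);               // real code: entityManager.persist(node)
        persisted.add(node);
        return true;
    }
}
```

You would call root.accept(new IdAssigningVisitor()) once on the deserialized root. The identity-based set matters because a tree built from mutable objects that reference each other may well contain cycles.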
After the records have been persisted for the first time, when you next load your root object through a Session or EntityManager, Hibernate will take care of loading all the referenced instances.
Over time, you can move business logic out of the model classes, but you won’t have to do this to start with. When the transaction commits, Hibernate will write back any changes made to the managed instances in the object model, such as those made by the existing code.
(Alternatively, I guess you could hook into the deserialization of the object tree in order to persist or look up the objects as they are deserialized. I’ll leave that as an exercise for another user.)
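For what it’s worth, one way such a deserialization hook might look is sketched below. It relies on the standard ObjectInputStream.enableResolveObject and resolveObject methods, which let a subclass inspect each object as it comes off the stream. Measurement, PersistingObjectInputStream and DeserializationHook are made-up names, and the hook only records labels where real code would persist or look up the object. Treat it as a starting point, not a tested migration path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical serializable domain class standing in for the real tree nodes.
class Measurement implements Serializable {
    private static final long serialVersionUID = 1L;
    final String label;
    final List<Measurement> parts = new ArrayList<>();

    Measurement(String label) { this.label = label; }
}

// ObjectInputStream subclass that gets a callback for deserialized objects.
class PersistingObjectInputStream extends ObjectInputStream {
    final List<String> seenLabels = new ArrayList<>();

    PersistingObjectInputStream(InputStream in) throws IOException {
        super(in);
        enableResolveObject(true);  // without this, resolveObject is never called
    }

    @Override
    protected Object resolveObject(Object obj) throws IOException {
        // The callback also fires for strings, lists and so on, so filter by type.
        if (obj instanceof Measurement) {
            // Real code would persist or look up the entity here (assumption).
            seenLabels.add(((Measurement) obj).label);
        }
        return obj;  // returning a different object would substitute it in the graph
    }
}

class DeserializationHook {
    // Serializes the tree to memory, reads it back through the hook, and
    // returns the labels of the Measurement objects the hook saw.
    static List<String> roundTrip(Measurement root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(root);
            }
            try (PersistingObjectInputStream in = new PersistingObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return in.seenLabels;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

In your case the input would be the existing serialized file rather than an in-memory buffer. The upside is that the domain classes stay untouched. The downside is that the hook sees objects in stream order, so it has to cope with whatever that order turns out to be.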