When my Python application starts, I would like to load all of the data from in.db and copy it into out.db (and then perhaps make changes in out.db). I use session.merge(loaded_object), but it does not save related objects.
My data is simple Person objects, with obvious parent-child relations between them (many to many):
from sqlalchemy import create_engine, Column, String, Integer, ForeignKey
from sqlalchemy.orm import sessionmaker, relationship, backref
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Person(Base):
    __tablename__ = "people"

    id = Column(Integer, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name

    def add_kid(self, kid):
        Edge(kid=kid, parent=self)
        return self

    def get_kids(self):
        return [edge.kid for edge in self.kid_edges]

    def add_parent(self, parent):
        Edge(kid=self, parent=parent)
        return self

    def get_parents(self):
        return [edge.parent for edge in self.parent_edges]

    def __repr__(self):
        return "<Person(id={id}, name={name})>".format(id=str(self.id),
                                                       name=self.name)


class Edge(Base):
    __tablename__ = "edges"

    id = Column(Integer, primary_key=True)
    kid_id = Column(Integer, ForeignKey("people.id"))
    parent_id = Column(Integer, ForeignKey("people.id"))
    kid = relationship("Person", primaryjoin="Edge.kid_id==Person.id",
                       backref=backref("parent_edges",
                                       collection_class=set))
    parent = relationship("Person", primaryjoin="Edge.parent_id==Person.id",
                          backref=backref("kid_edges",
                                          collection_class=set))

    def __init__(self, kid, parent):
        self.kid = kid
        self.parent = parent
I initialize the sessions via:
db_in_engine = create_engine("sqlite:///in.db", echo=True)
db_in_session_factory = sessionmaker(bind=db_in_engine)
db_in_session = db_in_session_factory()
db_out_engine = create_engine("sqlite:///out.db", echo=True)
db_out_session_factory = sessionmaker(bind=db_out_engine)
db_out_session = db_out_session_factory()
Base.metadata.create_all(db_out_engine)
The problem is that when I merge a person, the kids are not merged:
people = db_in_session.query(Person).all()
db_out_session.merge(people[0])
db_out_session.commit() # related Edges, kids and parents of people[0] are not saved
I’ve tried to add cascade="merge" to the relationships and the backrefs, but that did not work. Is there any way I could force it to save all of people[0]’s kids/parents and relevant Edges?
First off, don’t feel bad as I had to test this to see why it doesn’t work, and I wrote the thing.
The merge() use case is one where you’re taking some kind of in-application data, either from an offline cache or some locally modified structure, and moving it into a new Session. merge() is mostly about merging changes, so when it sees attributes that have no “change”, it assumes no special work is needed. So it skips unloaded relationships. If it did follow unloaded relationships, the merge process would become a very slow and burdensome operation as it traverses the full graph of relationships loading everything recursively, potentially loading a significant portion of the database into memory for a highly interlinked schema. The “copy from one database to another” use case here was not anticipated.
The data does go in if you just make sure all those edges are loaded ahead of time; here’s a demo. The default cascade is also "save-update, merge", so you don’t have to specify that.
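A minimal sketch of the "load the edges first, then merge" idea follows. It is an illustration, not the original demo, and it makes a few assumptions: SQLAlchemy 1.4 or later (for the declarative_base import from sqlalchemy.orm), two in-memory SQLite engines standing in for in.db and out.db, trimmed-down models that use the default keyword constructor, and two made-up sample rows ("mom" and "kid").

```python
from sqlalchemy import (create_engine, inspect, Column, String, Integer,
                        ForeignKey)
from sqlalchemy.orm import (sessionmaker, relationship, backref,
                            declarative_base)

Base = declarative_base()


class Person(Base):
    __tablename__ = "people"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Edge(Base):
    __tablename__ = "edges"
    id = Column(Integer, primary_key=True)
    kid_id = Column(Integer, ForeignKey("people.id"))
    parent_id = Column(Integer, ForeignKey("people.id"))
    kid = relationship("Person", primaryjoin="Edge.kid_id==Person.id",
                       backref=backref("parent_edges", collection_class=set))
    parent = relationship("Person", primaryjoin="Edge.parent_id==Person.id",
                          backref=backref("kid_edges", collection_class=set))


# Assumption: in-memory databases stand in for in.db and out.db.
in_engine = create_engine("sqlite://")
out_engine = create_engine("sqlite://")
Base.metadata.create_all(in_engine)
Base.metadata.create_all(out_engine)
in_session = sessionmaker(bind=in_engine)()
out_session = sessionmaker(bind=out_engine)()

# Seed the source database with hypothetical sample data.
mom, kid = Person(name="mom"), Person(name="kid")
in_session.add_all([mom, kid, Edge(kid=kid, parent=mom)])
in_session.commit()

people = in_session.query(Person).all()

# Right after the query, the edge collections are still unloaded,
# which is why merge() would skip them.
unloaded_before = set(inspect(people[0]).unloaded)

# Touch every relationship so the whole graph is loaded before merging.
for person in people:
    for edge in list(person.kid_edges) + list(person.parent_edges):
        edge.kid, edge.parent

# merge() now cascades along the loaded relationships.
for person in people:
    out_session.merge(person)
out_session.commit()

copied_people = out_session.query(Person).count()
copied_edges = out_session.query(Edge).count()
print(copied_people, copied_edges)
```

Touching the attributes triggers one lazy load per collection, which is fine for a small graph. For a larger one, eager-loading options on the query would likely be a better fit, at the cost of pulling much of the source database into memory, as described above.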