I’m using an existing database made by a third party with sqlalchemy. However, I’m having trouble as the tables do not have primar keys, and what’s worse, they have duplicate elements for each row, so I can’t pick an existing column as primary key. The tables have two columns: both have non-unique values.
I tried to monkey-patch the table as per http://www.blog.pythonlibrary.org/2010/09/10/sqlalchemy-connecting-to-pre-existing-databases/ but apparently this does not work (see below)
My current code is (MirnaTable is my mapped class, basically just a skeleton with nothing else)
connection = create_engine("sqlite:///targets.sqlite")
metadata = MetaData(bind=connection)
db_table = Table("miranda", metadata,
Column("id", Integer, primary_key=True),
autoload=True)
mapper(MirnaTable, db_table)
Session = sessionmaker(connection)
session = Session()
Then I try for example issuing
all_records = session.query(MirnaTable).all()
And I get
sqlalchemy.exc.OperationalError: (OperationalError) no such column: miranda.id
u'SELECT miranda.gene_id AS miranda_gene_id, miranda."mature_miRNA" AS
"miranda_mature_miRNA", miranda.id AS miranda_id \nFROM miranda' ()
So of course the id column isn’t found. Any ideas on what I’m doing wrong? Thanks in advance.
EDIT: As requested, here is an example from the table (retrieved directly from sqlite):
gene mature_miRNA
---- -------------
80205 hsa-miR-200c
80205 hsa-miR-200c
9693 hsa-miR-200c
9693 hsa-miR-200c
9881 hsa-miR-200c
9710 hsa-miR-200c
9750 hsa-miR-200c
You’ve misinterpreted the post you refer to. You have to choose an existing column and define it as primary. It’s also possible to setup composite primary key by putting them all in definition. In your case I think a gene has several mature microRNA, so the primary key should probably consist of
(gene_id, mature_miRNA)pair. Since there is no more fields in the table, there is no need inautoload=Trueflag.I don’t know the types of fields in your table, so change them appropriately if they are not integer.