I have a collection of physical parameters associated with different items. For example:
Item, p1, p2, p3
a, 1, 2, 3
b, 4, 5, 6
[...]
where px stands for parameter x.
I could go ahead and store the database exactly as presented; the schema would be
CREATE TABLE t1 (item TEXT PRIMARY KEY, p1 FLOAT, p2 FLOAT, p3 FLOAT);
I could retrieve the parameter p1 for all the items with the statement:
SELECT p1 FROM t1;
A second alternative is to have an schema like:
CREATE TABLE t1 (id INT PRIMARY KEY, item TEXT, par TEXT, val FLOAT)
This seems much simpler if you have many parameters (as I do). However, the parameter retrieval seems very awkward:
SELECT val FROM t1 WHERE par == 'p1'
What do you advice? Should go for the “pivoted” (first) version or the id, par, val (second) version?
Many thanks.
EDIT
For reference, I found the following persistence pattern in the SQLAlchemy examples site (the vertical mapping):
"""Mapping a vertical table as a dictionary.
This example illustrates accessing and modifying a "vertical" (or
"properties", or pivoted) table via a dict-like interface. These are tables
that store free-form object properties as rows instead of columns. For
example, instead of::
# A regular ("horizontal") table has columns for 'species' and 'size'
Table('animal', metadata,
Column('id', Integer, primary_key=True),
Column('species', Unicode),
Column('size', Unicode))
A vertical table models this as two tables: one table for the base or parent
entity, and another related table holding key/value pairs::
Table('animal', metadata,
Column('id', Integer, primary_key=True))
# The properties table will have one row for a 'species' value, and
# another row for the 'size' value.
Table('properties', metadata
Column('animal_id', Integer, ForeignKey('animal.id'),
primary_key=True),
Column('key', UnicodeText),
Column('value', UnicodeText))
Because the key/value pairs in a vertical scheme are not fixed in advance,
accessing them like a Python dict can be very convenient. The example below
can be used with many common vertical schemas as-is or with minor adaptations.
"""
In addition to the flexibility of the second approach, a further advantage is that parameters can then be rows in a parameter table, storing data about that parameter as part of the database, rather than as columns of the schema. It also leads naturally to an RDF triple representation of the data.
BTW you don’t need the added key field, make item and par a joint primary key
One limitation of the second approach is that the datatype of value must be the same for all parameters – OK if all floats but for generality this might have to be string with attendant loss of validation and the need for programatic data conversion.
Query speed will be affected, but you can get all the parameters for a term with a query like
which is easier to transform to a presentation format than the alternative.