Let’s say I have a database for storing (numeric) datapoints. Datapoints are grouped together into observations. Each datapoint belongs to one or more observations and each observation has one or more datapoints. So, I have three tables:
CREATE TABLE `data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time` datetime NOT NULL,
`value` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB ;
CREATE TABLE `obs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`datetime` datetime NOT NULL,
`posthoc` tinyint(1) NOT NULL,
`comments` varchar(500) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB ;
CREATE TABLE `on_obs_data` (
# linker or bridge table or whatever these are called
`id_obs` int(11) NOT NULL,
`id_data` int(11) NOT NULL,
KEY `id_obs` (`id_obs`),
KEY `id_data` (`id_data`),
CONSTRAINT `on_obs_data_ibfk_1` FOREIGN KEY (`id_obs`) REFERENCES `obs` (`id`),
CONSTRAINT `on_obs_data_ibfk_2` FOREIGN KEY (`id_data`) REFERENCES `data` (`id`)
) ENGINE=InnoDB ;
The problem is, how do I populate these three tables from a single spreadsheet (or, as the case may be, a single interim table populated via LOAD DATA LOCAL INFILE)? I can populate data and obs individually with no problems, but on_obs_data needs to know the IDs of the newly created entries in the two tables. None of the information in data and obs overlaps, and the entries in the respective tables are not guaranteed to be unique apart from the IDs, which are generated by the database on insert. The only thing linking a given data entry to a given obs entry is the fact that they were originally on the same row of a spreadsheet.
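To make the gap concrete: for a single spreadsheet row the link is easy to build, because LAST_INSERT_ID() returns the AUTO_INCREMENT value generated by the most recent INSERT on the current connection. A minimal sketch, where the literal values are made up purely for illustration:

```sql
-- One spreadsheet row at a time; all literal values are illustrative only.
START TRANSACTION;

INSERT INTO `data` (`time`, `value`)
VALUES ('2012-01-01 12:00:00', 42);
SET @id_data = LAST_INSERT_ID();  -- id just generated for the data row

INSERT INTO `obs` (`datetime`, `posthoc`, `comments`)
VALUES ('2012-01-01 12:05:00', 0, 'example row');
SET @id_obs = LAST_INSERT_ID();   -- id just generated for the obs row

-- Link the two new rows in the bridge table.
INSERT INTO `on_obs_data` (`id_obs`, `id_data`)
VALUES (@id_obs, @id_data);

COMMIT;
```

This does not carry over to a bulk load: after a multi-row INSERT, LAST_INSERT_ID() reports only the ID generated for the first inserted row, so the per-row pairing is lost.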
I’m looking for solutions that can be implemented inside MySQL without relying on client-side scripting.
I’m surprised there isn’t a clean or well-publicized pattern for this, given that this is critical for referential integrity in a normalized database, but here is what I’ve come up with:
1. Make sure the data and obs tables have one extra field besides the ones in the example code above. Let’s call it tempID. Make sure this field is permitted to have a NULL value.
2. Load the spreadsheet into an interim table* that has an autoincrementing ID field.
3. Insert into the data and obs tables the fields selected from this interim table normally, and have the ID field from the interim table go into the respective tempID fields of the data and obs tables.
4. Populate the bridge table by matching on tempID:
insert into on_obs_data (id_obs,id_data) select obs.id,data.id from obs,data where obs.tempID is not NULL and data.tempID is not NULL and obs.tempID = data.tempID;
5. Clear the tempID fields:
update obs set tempID = NULL; update data set tempID = NULL;

* I intentionally said ‘interim table’ rather than ‘temporary table’ because apparently MySQL doesn’t permit temporary tables to have autoincrementing ID fields. ಠ_ಠ
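The whole tempID approach can be sketched end to end in MySQL as follows. This is only a sketch under stated assumptions: the interim table name (interim), its column names, the file name spreadsheet.csv, and the CSV layout (comma-separated, one header line) are all hypothetical and would need to match the real spreadsheet.

```sql
-- Add a nullable tempID column to both target tables (one-time setup).
ALTER TABLE `data` ADD COLUMN `tempID` int(11) NULL;
ALTER TABLE `obs`  ADD COLUMN `tempID` int(11) NULL;

-- Interim table: one row per spreadsheet row, with its own generated id.
-- Table and column names here are assumptions.
CREATE TABLE `interim` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `data_time` datetime NOT NULL,
  `data_value` int(11) NOT NULL,
  `obs_datetime` datetime NOT NULL,
  `obs_posthoc` tinyint(1) NOT NULL,
  `obs_comments` varchar(500) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

-- File name and field/line terminators are assumptions about the export.
LOAD DATA LOCAL INFILE 'spreadsheet.csv'
INTO TABLE `interim`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(`data_time`, `data_value`, `obs_datetime`, `obs_posthoc`, `obs_comments`);

START TRANSACTION;

-- Copy each half of the row, carrying the interim id along as tempID.
INSERT INTO `data` (`time`, `value`, `tempID`)
SELECT `data_time`, `data_value`, `id` FROM `interim`;

INSERT INTO `obs` (`datetime`, `posthoc`, `comments`, `tempID`)
SELECT `obs_datetime`, `obs_posthoc`, `obs_comments`, `id` FROM `interim`;

-- Rows that came from the same spreadsheet row share a tempID.
-- The equality join already skips rows whose tempID is NULL.
INSERT INTO `on_obs_data` (`id_obs`, `id_data`)
SELECT `obs`.`id`, `data`.`id`
FROM `obs`
JOIN `data` ON `obs`.`tempID` = `data`.`tempID`;

-- Reset tempID so the next upload cannot collide with this one.
UPDATE `obs`  SET `tempID` = NULL;
UPDATE `data` SET `tempID` = NULL;

COMMIT;

DROP TABLE `interim`;
```

Wrapping the inserts and the reset in one transaction means a failure partway through leaves no half-linked rows behind. The ALTER TABLE, CREATE TABLE, and DROP TABLE statements sit outside it because DDL in MySQL causes an implicit commit.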
But something still troubles me: I would think this would be one of the first problems anybody trying to update a normalized database would run into. The knee-jerk assumption would be that “MySQL is stupid” or “these MySQL gurus don’t know much”, but I’ve learned that when I’m tempted to make that assumption, it’s often me overlooking something obvious that everyone else knows. So, MySQL community, have I just reinvented the wheel? Is there some simpler way the rest of you update bridge tables? Or am I using the wrong terminology and nobody could answer this question because nobody understood it?