Setup:
I was trying to understand the difference between identifying and non-identifying relationships when I found this great article on stackexchange. What's the difference between identifying and non-identifying relationships?
After reading a few comments it brought another question to mind about a problem I have been having.
Question:
Should I use multi-column primary keys on every child table and what are the advantages/disadvantages to doing so?
To better illustrate my question I have created an example below. I also included the comments that caused me to ask this question.
Example:
In my situation, I know the building_id and I need to get bed.data.
#1 – My current DB structure:
TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data }
floor { id, building_id, data }
room {id, floor_id, data }
bed {id, room_id, data }
This type of table structure would require me to use a few joins to get the data I need. Not a big deal but kind of a pain since I run into this situation a lot.
#2 – My interpretation of Bill Karwin’s suggested DB structure (see article comments below):
TABLE { FIELDS }
-----------------------------------------------------------------------
building { id, data }
floor { id, building_id, data }
room {id, building_id, floor_id, data }
bed {id, building_id, floor_id, room_id, data }
This table structure seems to eliminate the need for joins in my situation. So what are the disadvantages to this table structure? I really like the idea of not doing so many join statements.
Comments From Article:
What's the difference between identifying and non-identifying relationships?
@hobodave: It’s the “convention over configuration” argument. Some schools of thought are that every table should define its primary key for a single-column pseudokey named id that auto-generates its values. Application frameworks like Rails have popularized this as a default. They treat natural keys and multi-column keys as divergent from their conventions, needed when using “legacy” databases. Many other frameworks have followed this lead. – Bill Karwin Mar 10 ’10 at 23:06
It seems like “properly” constructing identifying relationships would lead to obnoxiously huge primary keys. e.g. Building has Floor has Room has Bed. The PK for Bed would be (bed_id, floor_id, room_id, building_id). It seem’s strange that I’ve never seen this in practice, nor heard it suggested as a way to do anything. That’s a lot of redundant data in the PK. – hobodave Mar 10 ’10 at 23:34
@hobodave: I have seen multi-column primary keys that are even larger. But I take your point. Consider that multi-column primary keys convey more information; you can query the Beds table for all beds in a specific building without doing any joins. – Bill Karwin Mar 11 ’10 at 1:00
This data is normalized
This table is not (bad idea)
selectonly indexes, the tables themselves will never be accessed. So why are you duplicating the indexes? MySQL will never need to read the main table anyway.floor_idin tablebedother than usingid+building_id+floor_id, This means that you may have to use much more key-space than needed in Model A. Either that or you need to add an extra index (which will drag around a full copy of the PK).In short
I see absolutly zero benefit and a whole lot of drawbacks in Model B, never use it!