I have a relatively simple subset of tables in my database for tracking something called sessions. These are academic sessions (think offerings of a particular program). The tables that represent a session's information are:
sessions
session_terms
session_subjects
session_mark_item_info
session_marks
All of these tables have their own primary keys and form a tree: sessions have terms, terms have subjects, subjects have mark items, etc. So each one would have at least its “parent’s” foreign key.
My question is: design-wise, is it a good idea to include the session's primary key in the other tables as a foreign key to easily select related session items, or is that too much redundancy?
If I include the session foreign key (or all parent foreign keys from tables up the hierarchy) in all the tables, I can easily select all the marks for a session. As an example, something like
SELECT mark FROM session_marks WHERE sessionID=...
If I don’t, then I would have to combine selects with something like
WHERE something IN (SELECT...
Which approach is “more correct” or efficient?
Thanks in advance!
The second approach is more correct. To pull the session information you would join tables. Do not be afraid to JOIN; that is the whole point of relational databases. You do not want to repeat yourself (normalization), so each table should keep a reference only to its parent, not to its parent's parent.
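This question comes up a lot for beginners. They think repeating the same key in sub-sub-tables is going to make their life easier, because their select can then filter directly on the session key, like the SELECT mark FROM session_marks WHERE sessionID=... query in the question.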
The problem is that you are introducing repetitive data into tables that probably do not need to know their parent's parent. You can already find this information somewhere in your design by joining to another table. For instance, session_marks joins to session_mark_item_info, which joins to session_subjects, which joins to session_terms.
Once you are at that point, you can find the session ID in the parent table. So do NOT repeat columns across tables that are related.
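As a rough sketch of that join chain, here is a small runnable example using Python's built-in sqlite3 module. The table names, sessionID, and mark come from the question; every other column name (termID, subjectID, itemID, markID, name), the sample rows, and the helper marks_for_session are assumptions made up for illustration, so adapt them to the real schema.

```python
import sqlite3

# In-memory database; each table keeps only its direct parent's key.
# Column names other than sessionID and mark are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (
    sessionID INTEGER PRIMARY KEY,
    name      TEXT
);
CREATE TABLE session_terms (
    termID    INTEGER PRIMARY KEY,
    sessionID INTEGER NOT NULL REFERENCES sessions(sessionID)
);
CREATE TABLE session_subjects (
    subjectID INTEGER PRIMARY KEY,
    termID    INTEGER NOT NULL REFERENCES session_terms(termID)
);
CREATE TABLE session_mark_item_info (
    itemID    INTEGER PRIMARY KEY,
    subjectID INTEGER NOT NULL REFERENCES session_subjects(subjectID)
);
CREATE TABLE session_marks (
    markID INTEGER PRIMARY KEY,
    itemID INTEGER NOT NULL REFERENCES session_mark_item_info(itemID),
    mark   REAL
);

-- Made-up sample data: two sessions, one branch of the tree each.
INSERT INTO sessions VALUES (1, 'Fall'), (2, 'Spring');
INSERT INTO session_terms VALUES (10, 1), (20, 2);
INSERT INTO session_subjects VALUES (100, 10), (200, 20);
INSERT INTO session_mark_item_info VALUES (1000, 100), (2000, 200);
INSERT INTO session_marks VALUES (1, 1000, 85.0), (2, 1000, 92.5), (3, 2000, 70.0);
""")


def marks_for_session(conn, session_id):
    """Return all marks for one session by walking up the parent keys."""
    rows = conn.execute(
        """
        SELECT m.mark
        FROM session_marks AS m
        JOIN session_mark_item_info AS i ON i.itemID = m.itemID
        JOIN session_subjects AS s ON s.subjectID = i.subjectID
        JOIN session_terms AS t ON t.termID = s.termID
        WHERE t.sessionID = ?
        ORDER BY m.markID
        """,
        (session_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling marks_for_session(conn, 1) walks from session_marks up to session_terms, where the session key already lives, so the query never needs a sessionID column on session_marks and does not even have to touch the sessions table.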