Hello,
I’m currently trying to apply the most efficient way to store an “extend” relationship between entities in a relational database.
For the sake of example, let’s say we have the following simplified entities:
- User
- Student (extends User)
- Teacher (extends User)
User contains attributes which apply to both Student and Teacher. Both Student and Teacher contain custom attributes which are unique to them.
The first thing that comes to mind is to create a single table with columns for all singular data (i.e. except one-to-many fields):
User
-------------
User ID
First name
Last name
Student class
Teacher office no.
Teacher description
...
This, however, won’t be very efficient from a storage perspective, because:
- the majority of rows will contain students, with a small number of teachers
- teachers will have many more unique columns, which would waste space in students’ rows
It would be more efficient to mirror the relationships between the entities in the tables themselves:
User
-------------
User ID
First name
Last name
...
Student
-------------
User ID
Student class
...
Teacher
-------------
User ID
Teacher office no.
Teacher description
...
So my questions are:
- Is the above concern taking it too far, i.e. should we leave storage efficiency to the database engine?
- Is splitting the entities into 3 tables still OK in terms of normalization?
- If it’s not a good approach, how would you recommend handling “extend” relationships in a relational database?
Thank you.
If a user can’t be both a teacher and a student, then you’re looking at a straightforward supertype/subtype problem. (I’m using supertype and subtype in their relational database design sense, not in their object-oriented programming sense.) You’re right to store in “students” only those attributes that describe students, and to store in “teachers” only those attributes that describe teachers.
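As a rough illustration of that layout, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names, the one-letter user_type codes ('S' and 'T'), and the open_db helper are all assumptions made for this example, not anything from the question. The idea it shows is one common way to enforce "a user can't be both": each subtype table carries a fixed type code and references the pair (user_id, user_type) in the supertype table.

```python
import sqlite3

# Hypothetical schema: names and type codes are illustrative only.
SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    user_type  TEXT NOT NULL CHECK (user_type IN ('S', 'T')),
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    UNIQUE (user_id, user_type)  -- target for the composite foreign keys
);
CREATE TABLE students (
    user_id       INTEGER PRIMARY KEY,
    user_type     TEXT NOT NULL DEFAULT 'S' CHECK (user_type = 'S'),
    student_class TEXT NOT NULL,
    FOREIGN KEY (user_id, user_type) REFERENCES users (user_id, user_type)
);
CREATE TABLE teachers (
    user_id     INTEGER PRIMARY KEY,
    user_type   TEXT NOT NULL DEFAULT 'T' CHECK (user_type = 'T'),
    office_no   TEXT NOT NULL,
    description TEXT,
    FOREIGN KEY (user_id, user_type) REFERENCES users (user_id, user_type)
);
"""

def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def try_insert_teacher(conn, user_id, office_no):
    """Return True if the teacher row was accepted, False if rejected."""
    try:
        conn.execute(
            "INSERT INTO teachers (user_id, office_no) VALUES (?, ?)",
            (user_id, office_no),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, a user registered with type 'S' can get a row in students, but an attempt to add a teachers row for the same user_id should be rejected by the foreign key, because no (user_id, 'T') pair exists in users. Other DBMSs support the same pattern, though the exact syntax may differ.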
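At this point, you’d typically expose each subtype through a view that joins it to the supertype table (one view for students, one for teachers), and you’d also do whatever your DBMS requires to make these two views updatable: triggers, rules, whatever. Application code inserts, updates, and deletes from the views, not from the base tables.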
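To make the idea of an updatable view concrete, here is a minimal sketch in Python with sqlite3, where INSTEAD OF triggers are the mechanism SQLite offers for this. The view name student_users, the trigger names, and the simplified two-table schema are assumptions for illustration; other DBMSs use different mechanisms (rules, automatically updatable views, and so on). Only the student side is shown; a teacher view would follow the same pattern.

```python
import sqlite3

# Hypothetical, simplified schema: one supertype, one subtype, one view.
SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL
);
CREATE TABLE students (
    user_id       INTEGER PRIMARY KEY REFERENCES users (user_id),
    student_class TEXT NOT NULL
);

-- The view the application works with instead of the base tables.
CREATE VIEW student_users AS
    SELECT u.user_id, u.first_name, u.last_name, s.student_class
    FROM users AS u
    JOIN students AS s ON s.user_id = u.user_id;

-- Route an insert on the view to both base tables.
-- This sketch assumes the caller always supplies user_id.
CREATE TRIGGER student_users_insert
INSTEAD OF INSERT ON student_users
BEGIN
    INSERT INTO users (user_id, first_name, last_name)
    VALUES (NEW.user_id, NEW.first_name, NEW.last_name);
    INSERT INTO students (user_id, student_class)
    VALUES (NEW.user_id, NEW.student_class);
END;

-- Route a delete on the view to both base tables, subtype row first.
CREATE TRIGGER student_users_delete
INSTEAD OF DELETE ON student_users
BEGIN
    DELETE FROM students WHERE user_id = OLD.user_id;
    DELETE FROM users WHERE user_id = OLD.user_id;
END;
"""

def open_db():
    """Create an in-memory database with the view and its triggers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def add_student(conn, user_id, first_name, last_name, student_class):
    """Insert through the view; the trigger splits the row across tables."""
    conn.execute(
        "INSERT INTO student_users (user_id, first_name, last_name, student_class) "
        "VALUES (?, ?, ?, ?)",
        (user_id, first_name, last_name, student_class),
    )
```

An update trigger is omitted to keep the sketch short; it would be written the same way, with one UPDATE per base table.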