Consider the following example. When I write “is-a”, I mean a column that is both a primary key for its table and a foreign key to another table to form a one-to-one relationship. When I write “has-a”, I mean a regular foreign key column, to form a one-to-many relationship.
- A fruitbasket table.
- An applebasket table, “is-a” fruitbasket.
- An orangebasket table, “is-a” fruitbasket.
- A fruit table, “has-a” fruitbasket.
- An apple table, “is-a” fruit.
- An orange table, “is-a” fruit.
Assume, for the time being, that there are enough context-specific columns in applebasket, orangebasket, apple and orange to warrant the existence of those tables instead of cluttering the parent tables with nullable columns or a type enumeration.
Questions:
- Is it better practice to relate between fruit and fruitbasket, or to relate apple and applebasket + orange and orangebasket? The former seems less redundant, but could potentially have invalid relations (apple -> fruit -> fruitbasket -> orangebasket, for example). The latter forces the relations to be valid, but is more redundant, and requires that any other inheriting fruit table declare its own basket foreign key.
- Specifically for PostgreSQL, given the first choice (relating fruit to fruitbasket), what is the simplest way for me to check relational validity? The check would have to perform three joins.
- Any other suggestions to implement this cleanly?
Thanks…
I think you are looking at this somewhat wrong. Relational data modelling is about data, while object modelling is about behavior. These are different disciplines, and as much as I like doing object-relational data modelling, has-a and is-a are not things that belong in the database. Instead, look at functional dependencies and model those. Otherwise you can end up with problems if you ever have multiple apps trying to access the same data in different ways.
For example, suppose we have two applications. One pulls data out, manipulates it, and models behavior. The second pulls data out, treats it as static, and derives information. Ask yourself whether the Liskov substitution principle (LSP) allows you to say “a square is-a rectangle.” In the first case, no: code that changes a rectangle’s width and height independently breaks when handed a square. In the second case, yes. So in the first case you might want a has-a “rectangular_area”, while in the second case “is-a rectangle” is perfectly valid.
So this brings me to my second point. If you are looking at this sort of complex relationship, how you do your mapping is likely to depend on what you are doing with your data. In general it is better to constrain your data based on definitional elements rather than behavioral elements. So in this case you have mappings wherever you need them. I would then suggest the following:
This brings me specifically to your questions:
Both. At the same time. See above.
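The suggested schema is not shown above, so what follows is only one possible way to get both relations at once, not necessarily the layout this answer had in mind. It assumes each basket holds a single kind of fruit, and it adds a hypothetical `kind` discriminator column so that composite foreign keys tie fruit to fruitbasket while the subtype tables pin the kind. SQLite, driven from Python's `sqlite3`, stands in for PostgreSQL purely so the sketch runs on its own; the DDL sticks to constructs PostgreSQL also supports.

```python
import sqlite3

# Assumption: "kind" is a hypothetical discriminator column, not part of the
# original question. Each subtype table pins it to a single value.
SCHEMA = """
CREATE TABLE fruitbasket (
    id   integer PRIMARY KEY,
    kind text NOT NULL CHECK (kind IN ('apple', 'orange')),
    UNIQUE (id, kind)              -- target for the composite foreign keys
);
CREATE TABLE applebasket (
    id   integer PRIMARY KEY,
    kind text NOT NULL DEFAULT 'apple' CHECK (kind = 'apple'),
    FOREIGN KEY (id, kind) REFERENCES fruitbasket (id, kind)
);
CREATE TABLE orangebasket (
    id   integer PRIMARY KEY,
    kind text NOT NULL DEFAULT 'orange' CHECK (kind = 'orange'),
    FOREIGN KEY (id, kind) REFERENCES fruitbasket (id, kind)
);
CREATE TABLE fruit (
    id        integer PRIMARY KEY,
    kind      text NOT NULL CHECK (kind IN ('apple', 'orange')),
    basket_id integer NOT NULL,
    UNIQUE (id, kind),
    -- a fruit may only sit in a basket of its own kind
    FOREIGN KEY (basket_id, kind) REFERENCES fruitbasket (id, kind)
);
CREATE TABLE apple (
    id   integer PRIMARY KEY,
    kind text NOT NULL DEFAULT 'apple' CHECK (kind = 'apple'),
    FOREIGN KEY (id, kind) REFERENCES fruit (id, kind)
);
CREATE TABLE orange (
    id   integer PRIMARY KEY,
    kind text NOT NULL DEFAULT 'orange' CHECK (kind = 'orange'),
    FOREIGN KEY (id, kind) REFERENCES fruit (id, kind)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql):
    """Run one INSERT; return False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = open_db()
    conn.execute("INSERT INTO fruitbasket (id, kind) VALUES (1, 'orange')")
    conn.execute("INSERT INTO orangebasket (id) VALUES (1)")
    # An apple placed in an orange basket should be rejected by the composite key.
    print(try_insert(conn, "INSERT INTO fruit (id, kind, basket_id) VALUES (11, 'apple', 1)"))
```

With this layout the invalid chain apple -> fruit -> fruitbasket -> orangebasket cannot be stored, and no joins are needed to check it. The cost is one repeated `kind` column per table.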
Declarative referential integrity will take you all the way there. Don’t be afraid to use bidirectional foreign keys with one side set to DEFERRABLE INITIALLY DEFERRED.
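As a rough illustration of a bidirectional foreign key with one side deferred, here is a minimal sketch. It is simplified to a single subtype, so every fruitbasket row here must also have an applebasket row, and that simplification is an assumption made for the example. SQLite via Python's `sqlite3` is used only to keep it runnable; in PostgreSQL the forward reference would have to be added with `ALTER TABLE ... ADD CONSTRAINT` once both tables exist, since the referenced table must exist first.

```python
import sqlite3

# Assumption: one subtype only, to keep the circular reference easy to follow.
SCHEMA = """
CREATE TABLE fruitbasket (
    id integer PRIMARY KEY
       -- checked at COMMIT, so the parent row can be inserted first
       REFERENCES applebasket (id) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE applebasket (
    -- checked immediately: the fruitbasket row must already exist
    id integer PRIMARY KEY REFERENCES fruitbasket (id)
);
"""


def open_db():
    """Create an in-memory database with the two mutually referencing tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


def add_basket(conn, basket_id, with_subtype=True):
    """Insert the parent row and, optionally, its subtype row in one transaction.

    Returns True if the transaction commits, False if a constraint rejects it.
    """
    try:
        conn.execute("INSERT INTO fruitbasket (id) VALUES (?)", (basket_id,))
        if with_subtype:
            conn.execute("INSERT INTO applebasket (id) VALUES (?)", (basket_id,))
        conn.commit()  # the deferred foreign key is verified here
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


if __name__ == "__main__":
    conn = open_db()
    print(add_basket(conn, 1))                      # both rows inserted together
    print(add_basket(conn, 2, with_subtype=False))  # parent row without its subtype
```

Deferring only one side keeps the other check immediate, so each transaction inserts the parent first, then the child, and anything left incomplete is caught at commit.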