I’m designing a product database where products can have very different attributes depending on their type, but attributes are fixed for each type and types are not manageable at all. E.g.:
magazine: title, issue_number, pages, copies, close_date, release_date
web_site: name, bandwidth, hits, date_from, date_to
I want to use InnoDB and enforce database integrity as much as the engine allows. What’s the recommended way to handle this?
I hate those designs where tables have 100 columns and most of the values are NULL so I thought about something like this:
product_type
============
product_type_id INT
product_type_name VARCHAR
product
=======
product_id INT
product_name VARCHAR
product_type_id INT -> Foreign key to product_type.product_type_id
valid_since DATETIME
valid_to DATETIME
magazine
========
magazine_id INT
title VARCHAR
product_id INT -> Foreign key to product.product_id
issue_number INT
pages INT
copies INT
close_date DATETIME
release_date DATETIME
web_site
========
web_site_id INT
name VARCHAR
product_id INT -> Foreign key to product.product_id
bandwidth INT
hits INT
date_from DATETIME
date_to DATETIME
This can handle cascaded product deletion but… Well, I’m not fully convinced…
This is a classic OO design to relational tables impedance mismatch. The table design you’ve described is known as ‘table per subclass’. The three most common designs are all compromises compared to what your objects actually look like in your app:
The design you don’t like – “where tables have 100 columns and most of the values are NULL” – is 2. one Table to store the whole specialization hierarchy. This is the least flexible for all kinds of reasons, including – if your app requires a new sub-class, you need to add columns. The design you describe accommodates change much better because you can add extend it by adding a new sub-class table described by a value in product_type.
The remaining option – 1. Table per concrete class – is usually undesirable because of the duplication involved in implementing all the common fields in each specialization table. Although, the advantages are that you wont need to perform any joins and the sub-class tables can even be on different db instances in a very large system.
The design you described is perfectly viable. The variation below is how it might look if you were using an ORM tool to do your CRUD operations. Notice how the ID in each sub-class table IS the FK value to the parent table in the hierarchy. A good ORM will automatically manage the correct sub-class table CRUD based on the value of the discriminator values in product.id and product.product_type_id alone. Whether you are planning on using an ORM or not, look at hibernate’s joined sub-class documentation, if only to see the design decisions they made.