I want to represent documents in a database. There are several different types of documents. All documents have certain things in common, but not all documents are the same.
For example, let’s say I have a basic table for documents…
TABLE docs (
ID
title
content
)
Now let’s say I have a subset of documents that can belong to a user, and that can have additional info associated with them. I could do the following…
TABLE docs (
ID
userID -> users(ID)
title
content
additionalInfo
)
…however this will result in a lot of null values in the table, as only some documents can belong to a user, not all. So instead I have created a second table “ownedDocs” to extend “docs”:
TABLE ownedDocs (
docID -> docs(ID)
userID -> users(ID)
additionalInfo
)
I am wondering: is this the right way to do it? (I am worried because, as long as everything is in one table, I have a one-to-many relationship between docs and users. However, by creating a new table ownedDocs, the data structure looks like I have a many-to-many relationship between docs and users, which will never occur.)
Thanks in advance for your help
If you make OwnedDocs.DocId the primary key it will be quite clear that a 1:N relationship between docs and ownedDocs is impossible: each document can have at most one ownedDocs row, and therefore at most one owner.
The modelling of zero-or-one-to-one relationships is tricky. If we have just the one sub-type then the single table with NULL columns is a reasonable approach. However, it is good practice to ensure that the sub-type's attributes are only populated when appropriate. In the given example that would mean a check constraint to enforce this rule:
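As a minimal sketch of what such a constraint might look like (my assumption: additionalInfo may only be filled in when the document has a userID), here is the single-table variant run against an in-memory SQLite database from Python. The column types, the try_insert helper and the sample values are made up for illustration; other databases accept the same CHECK syntax but may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY);
    CREATE TABLE docs (
        ID             INTEGER PRIMARY KEY,
        userID         INTEGER REFERENCES users(ID),  -- NULL for unowned documents
        title          TEXT NOT NULL,
        content        TEXT,
        additionalInfo TEXT,
        -- Assumed rule: additionalInfo may only be populated on owned documents
        CHECK (userID IS NOT NULL OR additionalInfo IS NULL)
    );
    INSERT INTO users (ID) VALUES (1);
""")

def try_insert(user_id, additional_info):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO docs (userID, title, content, additionalInfo) VALUES (?, ?, ?, ?)",
            (user_id, "a title", "some content", additional_info),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this rule an unowned document carrying additionalInfo is rejected, while an owned document may still leave additionalInfo empty.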
Or maybe even this rule:
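A stricter rule could look something like the sketch below. This is my assumption of one plausible form: userID and additionalInfo are either both NULL or both populated, which makes additionalInfo mandatory for owned documents. It uses Python's sqlite3 module with an in-memory database; the column types and the try_insert helper are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY);
    CREATE TABLE docs (
        ID             INTEGER PRIMARY KEY,
        userID         INTEGER REFERENCES users(ID),  -- NULL for unowned documents
        title          TEXT NOT NULL,
        content        TEXT,
        additionalInfo TEXT,
        -- Assumed rule: the two sub-type columns are populated together or not at all
        CHECK (
            (userID IS NULL AND additionalInfo IS NULL)
            OR (userID IS NOT NULL AND additionalInfo IS NOT NULL)
        )
    );
    INSERT INTO users (ID) VALUES (1);
""")

def try_insert(user_id, additional_info):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO docs (userID, title, content, additionalInfo) VALUES (?, ?, ?, ?)",
            (user_id, "a title", "some content", additional_info),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here an owned document without additionalInfo is rejected as well, so the constraint carries a business rule that nothing in the table layout itself reveals.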
The relationship between attributes won’t show up in an ERD (unless you use a naming convention). For sure, the mandatory nature of AdditionalInfo for owned documents won’t be obvious in the second case.
Once we have several such sub-types the case for separate tables becomes compelling, especially if the sub-types constitute an arc, e.g. a Document can be a FinancialDocument or a MedicalDocument or a PersonnelDocument but not more than one category. I once implemented such a model using a single table with lots of null columns, views and check constraints. It was horrible. Sub-type tables are definitely the way to go.
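To make the sub-type table approach concrete, here is a rough sketch of the docs / ownedDocs pair from the question with ownedDocs.docID as the primary key, again using Python's sqlite3 module and an in-memory database. The column types, the NOT NULL on additionalInfo, the sample rows and the two helper functions are assumptions for illustration, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY);
    CREATE TABLE docs (
        ID      INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        content TEXT
    );
    -- docID is both primary key and foreign key, so a document
    -- can have at most one ownedDocs row (zero-or-one to one).
    CREATE TABLE ownedDocs (
        docID          INTEGER PRIMARY KEY REFERENCES docs(ID),
        userID         INTEGER NOT NULL REFERENCES users(ID),
        additionalInfo TEXT NOT NULL  -- assumed mandatory for owned documents
    );
    INSERT INTO users (ID) VALUES (1), (2);
    INSERT INTO docs (ID, title, content) VALUES (10, 'owned', '...'), (11, 'unowned', '...');
    INSERT INTO ownedDocs (docID, userID, additionalInfo) VALUES (10, 1, 'notes');
""")

def second_owner_rejected():
    """Try to give document 10 a second owner; the primary key should refuse it."""
    try:
        conn.execute(
            "INSERT INTO ownedDocs (docID, userID, additionalInfo) VALUES (10, 2, 'more')"
        )
        return False
    except sqlite3.IntegrityError:
        return True

def list_docs():
    """All documents with their owner, or None for unowned ones."""
    return conn.execute("""
        SELECT d.ID, d.title, o.userID
        FROM docs d
        LEFT JOIN ownedDocs o ON o.docID = d.ID
        ORDER BY d.ID
    """).fetchall()
```

A user can still own many documents, but a document can never end up with two owners, and the NOT NULL columns do the job the check constraints had to do in the single-table version.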