I have a basic CRUD web app where people can create and edit articles. I now want to add the ability to keep a revision history of all edits. Currently, I have an Articles table that looks like this:
Article(id, title, content, author_id, category_id, format)
I have considered 2 options for changing my current schema to add support for revision history. The basic idea is that every single edit to any article is stored as a record in a Revision table, so Articles and Revisions have a one-to-many relationship.
1st option (normalized):
One table for article metadata, one for revisions. No duplicate data stored.
Article(id, title, category_id)
Revision(id, content, author_id, format)
2nd option (de-normalized):
Two tables like option 1 but with some duplicate columns.
Article(id, title, content, author_id, category_id, format)
Revision(id, article_id, content, author_id, format)
I’m thinking of going with the 2nd option because it will make my coding much easier (less complex, fewer lines of code). I know it isn’t “academic” and “pure”, but my personal feeling is that having to do extra joins would hurt code maintenance. Also, performance should be better since not as many joins will have to be done.
Is this a sound way to go about this task? Are there any unforeseen or long-term consequences I am overlooking?
The performance argument is nonsense – you are doing fewer JOINs, but RDBMSs are optimized for JOINs. However, you are potentially pulling a lot more data from the server than is necessary, which can’t be optimized away.
You also potentially have a consistency issue. Duplicating data for the same item in different tables leads to the ability to have inconsistencies. What if the revision records and the article record have different values for format or author? How do you know which is correct? What if the content in Articles doesn’t match any of the revisions?
You really should normalize this. I would add a CurrentRevision field to your Articles table to link to the current version, and you should have an ArticleID in the Revisions table to link the two together.
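
To make the suggested layout concrete, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the snake_case column names (current_revision_id, article_id) and the helper functions save_revision and current_article are assumptions for illustration only; adapt them to whatever database and naming your app already uses.

```python
import sqlite3


def create_schema(conn):
    # Enforce the REFERENCES clauses below (SQLite leaves them off by default).
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE article (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            category_id INTEGER,
            -- Points at the revision currently shown; NULL until the first save.
            current_revision_id INTEGER REFERENCES revision(id)
        );
        CREATE TABLE revision (
            id INTEGER PRIMARY KEY,
            article_id INTEGER NOT NULL REFERENCES article(id),
            content TEXT NOT NULL,
            author_id INTEGER NOT NULL,
            format TEXT NOT NULL
        );
    """)


def save_revision(conn, article_id, content, author_id, fmt):
    # Hypothetical helper: every edit becomes a new revision row,
    # and the article is repointed at it.
    cur = conn.execute(
        "INSERT INTO revision (article_id, content, author_id, format) "
        "VALUES (?, ?, ?, ?)",
        (article_id, content, author_id, fmt),
    )
    conn.execute(
        "UPDATE article SET current_revision_id = ? WHERE id = ?",
        (cur.lastrowid, article_id),
    )
    conn.commit()
    return cur.lastrowid


def current_article(conn, article_id):
    # Hypothetical helper: one JOIN fetches the metadata plus the current content.
    return conn.execute(
        "SELECT a.title, r.content, r.author_id, r.format "
        "FROM article AS a "
        "JOIN revision AS r ON r.id = a.current_revision_id "
        "WHERE a.id = ?",
        (article_id,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
create_schema(conn)
article_id = conn.execute(
    "INSERT INTO article (title, category_id) VALUES (?, ?)", ("Hello", 1)
).lastrowid
save_revision(conn, article_id, "first draft", 7, "markdown")
save_revision(conn, article_id, "second draft", 8, "markdown")
print(current_article(conn, article_id))
```

With this layout, content, author and format live only in the revision rows, so the article record has nothing that can disagree with its history, and reading the current version costs a single JOIN on a primary key.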