We are currently building a web application in which one piece of functionality lets users create Events. Those events can later be deleted by the user or by an administrator. However, the client requires that an event is not physically deleted from the database, only marked as deleted. Users should see only non-deleted events, but administrators should also be able to browse the deleted ones. That is really all the functionality there is.
I suggested that we simply add one extra column called "status", with a couple of valid values: ACTIVE and DELETED. This way we can distinguish between normal (active) and deleted events and write really simple queries (SELECT * FROM EVENTS WHERE STATUS = 'ACTIVE').
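To make the status-column idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The lowercase table name, the `name` column and the sample rows are assumptions for illustration only; the real EVENTS table would have its own columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id     INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,                       -- hypothetical column
        status TEXT NOT NULL DEFAULT 'ACTIVE'
               CHECK (status IN ('ACTIVE', 'DELETED'))
    )
""")
conn.executemany("INSERT INTO events (name) VALUES (?)",
                 [("Conference",), ("Workshop",)])

# "Deleting" an event only flips its status; the row stays in the table.
conn.execute("UPDATE events SET status = 'DELETED' WHERE name = ?",
             ("Workshop",))

# Regular users see only active events.
user_view = conn.execute(
    "SELECT name FROM events WHERE status = 'ACTIVE' ORDER BY id").fetchall()

# Administrators see everything, including deleted events.
admin_view = conn.execute(
    "SELECT name, status FROM events ORDER BY id").fetchall()

print(user_view)
print(admin_view)
```

The CHECK constraint is optional, but it keeps stray values out of the status column.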
My colleague, however, disagreed. He pointed out that although active and deleted events currently share the same information (and so can be stored in the same table), the requirements may change in the future, and the client may, for example, need to store additional information about a deleted Event (such as the date of deletion, who deleted it, and why, as a sort of comment). He said that to fulfil those requirements we would have to add columns to the EVENTS table that hold data specific to deleted Events and are meaningless for active ones. He proposed a solution in which an additional table (something like DELETED_EVENTS) is created with the same schema as the EVENTS table. Every deleted event would be physically deleted from the EVENTS table and moved to the DELETED_EVENTS table.
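For comparison, a rough sketch of how I understand my colleague's proposal, again with sqlite3 and a made-up `name` column. The `delete_event` helper is hypothetical; the point is that the move has to happen in one transaction, and that the administrator view needs both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events         (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE deleted_events (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO events (id, name) VALUES (1, 'Conference'), (2, 'Workshop');
""")

def delete_event(conn, event_id):
    """Move one row from events to deleted_events (hypothetical helper)."""
    # Copy and delete in a single transaction so the row is never
    # lost or present in both tables.
    with conn:
        conn.execute(
            "INSERT INTO deleted_events (id, name) "
            "SELECT id, name FROM events WHERE id = ?", (event_id,))
        conn.execute("DELETE FROM events WHERE id = ?", (event_id,))

delete_event(conn, 2)

# Users query the events table directly, with no filter at all.
active = conn.execute("SELECT name FROM events").fetchall()

# The administrator view has to combine both tables.
admin_view = conn.execute("""
    SELECT id, name, 'ACTIVE'  AS status FROM events
    UNION ALL
    SELECT id, name, 'DELETED' AS status FROM deleted_events
    ORDER BY id
""").fetchall()

print(active)
print(admin_view)
```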
I strongly disagreed with his idea. Not only would it make the SQL queries more complex and less efficient, it also goes directly against YAGNI. I also disagreed that my idea would force us to add extra (nullable) columns to the EVENTS table if the requirements changed in the future. In my scenario I would simply create a new table such as DELETED_EVENTS_DATA (holding the additional archive data) and add a reference key to the EVENTS table to maintain a one-to-one relationship between the EVENTS and DELETED_EVENTS_DATA tables.
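A sketch of what I have in mind for that later extension, assuming the extra data turns out to be a deletion date, the deleting user and a comment. All column names here are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE deleted_events_data (
        id         INTEGER PRIMARY KEY,
        deleted_at TEXT NOT NULL,     -- hypothetical columns
        deleted_by TEXT NOT NULL,
        reason     TEXT
    );
    CREATE TABLE events (
        id              INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        status          TEXT NOT NULL DEFAULT 'ACTIVE',
        -- NULL for active events; UNIQUE keeps the relationship one-to-one.
        deleted_data_id INTEGER UNIQUE REFERENCES deleted_events_data(id)
    );
    INSERT INTO events (id, name) VALUES (1, 'Conference'), (2, 'Workshop');
""")

# Soft-delete event 2 and attach its deletion details in one transaction.
with conn:
    cur = conn.execute(
        "INSERT INTO deleted_events_data (deleted_at, deleted_by, reason) "
        "VALUES (?, ?, ?)",
        ("2024-01-15", "admin", "Cancelled by organiser"))
    conn.execute(
        "UPDATE events SET status = 'DELETED', deleted_data_id = ? "
        "WHERE id = ?",
        (cur.lastrowid, 2))

# Active events simply have no matching detail row.
rows = conn.execute("""
    SELECT e.name, e.status, d.deleted_by, d.reason
    FROM events e
    LEFT JOIN deleted_events_data d ON d.id = e.deleted_data_id
    ORDER BY e.id
""").fetchall()

print(rows)
```

The only change to EVENTS is a single nullable reference column; everything specific to deletion lives in the second table.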
Nevertheless, I was struck by the fact that two developers who usually share similar views on software and database design could have such radically different opinions about how this requirement should be modelled at the database level. I wondered whether we might both be heading in the wrong direction and there is another (third) solution. Or are there more than just one alternative?
How do you design this sort of requirement? Are there any patterns or guidelines on how to do it properly? Any help will be deeply appreciated.
OK, the way we handle it is as follows.
We have an extra column on every table called 'Deleted', which is a bit field. Then, as you rightly said, your queries are quite simple, as it's just a WHERE clause to filter the deleted rows out or leave them in. The only thing you need to make sure of is that any reporting or stats you generate filter out the deleted records.
Then, for the extra info that you are talking about wanting to capture, that info would go in a separate 'audit'-like table. In our case we have made this extra table quite generic, so it can hold this audit info for any table.
If you have other entities you want to capture as well (like Location, where Location is a table), they are recorded in the same audit table.
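A minimal sketch of such a generic audit table, written with Python's sqlite3 so it can be run as is. Only the 'Deleted' flag and ActionTypeId come from our actual setup as described here; the other names (TableName, RecordId, ActionBy, ActionDate, Comment) and the `soft_delete` helper are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event    (EventId    INTEGER PRIMARY KEY, Name TEXT,
                           Deleted    INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE Location (LocationId INTEGER PRIMARY KEY, Name TEXT,
                           Deleted    INTEGER NOT NULL DEFAULT 0);
    -- One audit table for every entity: TableName + RecordId identify the row.
    CREATE TABLE Audit (
        AuditId      INTEGER PRIMARY KEY,
        TableName    TEXT    NOT NULL,
        RecordId     INTEGER NOT NULL,
        ActionTypeId INTEGER NOT NULL,   -- 1 = delete
        ActionBy     TEXT    NOT NULL,
        ActionDate   TEXT    NOT NULL,
        Comment      TEXT
    );
    INSERT INTO Event    (EventId, Name)    VALUES (1, 'Conference');
    INSERT INTO Location (LocationId, Name) VALUES (7, 'Main hall');
""")

DELETE_ACTION = 1

def soft_delete(conn, table, key_column, record_id, user, when, comment):
    """Flag a row as deleted and record who did it (hypothetical helper)."""
    # table and key_column are interpolated into the SQL, so they must
    # come from trusted code, never from user input.
    with conn:
        conn.execute(
            f"UPDATE {table} SET Deleted = 1 WHERE {key_column} = ?",
            (record_id,))
        conn.execute(
            "INSERT INTO Audit (TableName, RecordId, ActionTypeId, "
            "ActionBy, ActionDate, Comment) VALUES (?, ?, ?, ?, ?, ?)",
            (table, record_id, DELETE_ACTION, user, when, comment))

soft_delete(conn, "Event", "EventId", 1, "alice", "2024-01-15", "Cancelled")
soft_delete(conn, "Location", "LocationId", 7, "bob", "2024-01-16", "Closed")

audit_rows = conn.execute(
    "SELECT TableName, RecordId, ActionTypeId, ActionBy FROM Audit "
    "ORDER BY AuditId").fetchall()
print(audit_rows)
```

The trade-off of a generic table is that RecordId cannot be a real foreign key, since it points at a different table depending on TableName.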
Then, when you want to get the extra audit data back out, it's quite simple: the query is a join from the entity's table to the audit table.
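As a rough illustration, a query of that kind could look like the following. The schema is the same hypothetical one as before (TableName, RecordId, ActionBy and so on are assumed names, not our real columns), set up inline so the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Event (EventId INTEGER PRIMARY KEY, Name TEXT,
                        Deleted INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE Audit (AuditId INTEGER PRIMARY KEY, TableName TEXT,
                        RecordId INTEGER, ActionTypeId INTEGER,
                        ActionBy TEXT, ActionDate TEXT, Comment TEXT);
    INSERT INTO Event VALUES (1, 'Conference', 1), (2, 'Workshop', 0);
    INSERT INTO Audit VALUES
        (1, 'Event', 1, 1, 'alice', '2024-01-15', 'Cancelled');
""")

# Deleted events together with who deleted them, when, and why.
deleted_events = conn.execute("""
    SELECT e.EventId, e.Name, a.ActionBy, a.ActionDate, a.Comment
    FROM Event e
    JOIN Audit a
      ON a.TableName    = 'Event'
     AND a.RecordId     = e.EventId
     AND a.ActionTypeId = 1          -- 1 = delete
    WHERE e.Deleted = 1
""").fetchall()

print(deleted_events)
```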
This audit table can also capture other events, not just deletes. This is done using the ActionTypeId column. At the moment it only ever holds 1 (which means delete), but you could have other values as well.
Hope this helps
EDIT:
On top of this, if we have strong audit requirements, we do the following. None of the above changes, but we create a second database called 'xyz_Audit', which captures the pre and post state for every action that happens within the database. This second database has the same schema as the first database (without the Audit table), except that every table has 2 extra columns.
The first extra column is a PrePostFlag and the second column is the AuditId. Hence the primary key now goes across 3 columns, ‘xyzId’, ‘PrePostFlag’ and ‘AuditId’.
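A small sketch of that layout, using an attached in-memory SQLite database to stand in for 'xyz_Audit'. The 'PRE'/'POST' text values, the `snapshot` and `audited_soft_delete` helpers, and the AuditId value are assumptions for the example; in practice the copying could just as well be done by triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                       # main database
conn.execute("ATTACH DATABASE ':memory:' AS xyz_Audit")  # audit database
conn.executescript("""
    CREATE TABLE main.Event (
        EventId INTEGER PRIMARY KEY,
        Name    TEXT,
        Deleted INTEGER NOT NULL DEFAULT 0
    );
    -- Same columns as main.Event plus the two extra audit columns.
    CREATE TABLE xyz_Audit.Event (
        EventId     INTEGER NOT NULL,
        Name        TEXT,
        Deleted     INTEGER NOT NULL,
        PrePostFlag TEXT    NOT NULL CHECK (PrePostFlag IN ('PRE', 'POST')),
        AuditId     INTEGER NOT NULL,
        PRIMARY KEY (EventId, PrePostFlag, AuditId)
    );
    INSERT INTO main.Event (EventId, Name) VALUES (1, 'Conference');
""")

def snapshot(conn, event_id, flag, audit_id):
    """Copy the current state of one event into the audit database."""
    conn.execute(
        "INSERT INTO xyz_Audit.Event "
        "(EventId, Name, Deleted, PrePostFlag, AuditId) "
        "SELECT EventId, Name, Deleted, ?, ? FROM main.Event "
        "WHERE EventId = ?",
        (flag, audit_id, event_id))

def audited_soft_delete(conn, event_id, audit_id):
    # Pre-image, change and post-image are written in one transaction.
    with conn:
        snapshot(conn, event_id, "PRE", audit_id)
        conn.execute("UPDATE main.Event SET Deleted = 1 WHERE EventId = ?",
                     (event_id,))
        snapshot(conn, event_id, "POST", audit_id)

audited_soft_delete(conn, 1, audit_id=100)

history = conn.execute(
    "SELECT PrePostFlag, Deleted FROM xyz_Audit.Event "
    "WHERE EventId = 1 AND AuditId = 100 ORDER BY Deleted").fetchall()
print(history)
```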
By doing this we give the admins full power to know who did what and when, which data changed, and how it changed. To undelete a record, we just need to change the deleted flag in the primary database.
Also, having this data in a different database allows us to use different optimization, storage and management plans from those of the main transactional database.