SQL Server 2005.
In our application, we have an entity with a parent table as well as several child tables. We would like to track revisions made to this entity. After going back and forth, we’ve narrowed it down to two approaches to choose from.
1. Have one history table for the entity. Before a sproc updates the table, retrieve the entire current state of the entity from the parent table and all child tables. XMLize it and stick it into the history table as the XML data type. Include some columns to query by, as well as a revision number/created date.
2. For each table, create a matching history table with the same columns. Also have a revision number/created date. Before a sproc updates a single table, retrieve the existing state of the record for that one table, and copy it into the history table. So, it’s a little bit like SVN. If I want to get an entity at revision Y, I need to get the history record in each table with the maximum revision number that is not greater than Y. An entity might have 50 revision records in one table, but only 3 revision records in a child table, etc. I would probably want to persist the revision counter for the entire entity somewhere.
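The "maximum revision number that is not greater than Y" lookup from the second approach can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module so that it runs standalone, not SQL Server, and the table and column names (parent_history, child_history, entity_id, revision) are made up. It also assumes each history row is stamped with the revision at which that state became current, and that each table holds one row per entity.

```python
import sqlite3

# Hypothetical history tables; every name here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent_history (entity_id INTEGER, revision INTEGER, name TEXT);
CREATE TABLE child_history  (entity_id INTEGER, revision INTEGER, note TEXT);
INSERT INTO parent_history VALUES (1, 1, 'v1'), (1, 2, 'v2'), (1, 5, 'v5');
INSERT INTO child_history  VALUES (1, 1, 'a'), (1, 4, 'b');
""")

HISTORY_TABLES = ("parent_history", "child_history")

def state_at(conn, table, entity_id, revision):
    """Return the row of `table` in effect at `revision`, or None."""
    # Table names cannot be bound as parameters, so check against a fixed list.
    if table not in HISTORY_TABLES:
        raise ValueError(f"unknown history table: {table}")
    sql = (
        f"SELECT * FROM {table} "
        "WHERE entity_id = ? AND revision = ("
        f"  SELECT MAX(revision) FROM {table} "
        "  WHERE entity_id = ? AND revision <= ?)"
    )
    return conn.execute(sql, (entity_id, entity_id, revision)).fetchone()

def entity_at(conn, entity_id, revision):
    """Assemble the entity at `revision` from every history table."""
    return {t: state_at(conn, t, entity_id, revision) for t in HISTORY_TABLES}
```

With the sample rows above, entity_at(conn, 1, 4) picks the parent row stamped with revision 2 and the child row stamped with revision 4, because those are the latest rows in each table that do not exceed 4. The inner MAX(revision) subquery is plain SQL and should carry over to T-SQL with little change.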
Both approaches seem to have their headaches, but I still prefer solution #2 to solution #1. This is a database that’s already huge, and already suffers from performance issues. Bloating it with XML blobs on every revision (and there will be plenty) seems like a horrible way to go. Creating history tables for everything is a cost I’m willing to eat, as long as there’s not a better way to do this.
Any suggestions?
Thanks,
Tedderz
Number 2 is almost certainly the way to go, and I do something like this with my history tables, though I also use an “events” table to correlate the changes with one another instead of relying on a timestamp. I guess this is what you mean by a “revision counter”. My “events” table contains a unique ID, a timestamp (of course), the application user responsible for the change, and an “action” designator that identifies the application-level action the user performed to cause the change.
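As a rough sketch of that kind of “events” table: the snippet below uses Python's built-in sqlite3 module so it can run on its own (it is not SQL Server), and every table, column, and function name in it is hypothetical. The idea is only that one event row is written per application-level action, and each history row written as part of that action carries the same event ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema: one events row per application-level action.
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    app_user   TEXT NOT NULL,
    action     TEXT NOT NULL
);
-- History rows point back at the event that produced them.
CREATE TABLE parent_history (
    event_id INTEGER REFERENCES events(event_id), entity_id INTEGER, name TEXT);
CREATE TABLE child_history (
    event_id INTEGER REFERENCES events(event_id), entity_id INTEGER, note TEXT);
""")

def record_change(conn, app_user, action, parent_rows, child_rows):
    """Write one event plus the history rows it caused, in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO events (app_user, action) VALUES (?, ?)",
            (app_user, action),
        )
        event_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO parent_history VALUES (?, ?, ?)",
            [(event_id, *row) for row in parent_rows],
        )
        conn.executemany(
            "INSERT INTO child_history VALUES (?, ?, ?)",
            [(event_id, *row) for row in child_rows],
        )
    return event_id
```

Joining the history tables on event_id then gives every row that changed as part of one user action, without having to compare timestamps.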
Why #2? Because you can more easily partition the table to archive or roll-off old entries. Because it’s easier to index. Because it’s a WHOLE lot easier to query. Because it has less overhead than XML and is a lot smaller.
Also, consider using triggers instead of coding a stored procedure to do all of this. Triggers are almost always to be avoided, but for history tracking like this, they’re a fairly lightweight and robust option.
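To give a feel for the trigger approach, here is a minimal sketch using Python's built-in sqlite3 module, with hypothetical table names (entity, entity_history). Note that this is SQLite syntax, not T-SQL: SQLite has row-level BEFORE UPDATE triggers with an OLD row, whereas in SQL Server the equivalent would be an AFTER UPDATE trigger that copies rows from the deleted pseudo-table into the history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical live table and its matching history table.
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE entity_history (
    id         INTEGER,
    name       TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Copy the pre-update row into the history table on every update.
CREATE TRIGGER entity_keep_history
BEFORE UPDATE ON entity
FOR EACH ROW
BEGIN
    INSERT INTO entity_history (id, name) VALUES (OLD.id, OLD.name);
END;
""")

# The application only touches the live table; the trigger does the rest.
conn.execute("INSERT INTO entity (id, name) VALUES (1, 'first')")
conn.execute("UPDATE entity SET name = 'second' WHERE id = 1")

history = conn.execute("SELECT id, name FROM entity_history").fetchall()
```

The point of the design is that no caller can update the table and forget to write the history row, since the copy happens inside the database regardless of which sproc or ad hoc statement made the change.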