In our inventory database (SQL Server 2008 Standard Edition) we have a table called StockResults that stores results for each stock item by stock period. It looks like this:
<< StockResults >>
PK StockPeriodID int
PK StockItemID int
OStockCost money
OStockQty real
DeliveriesQty real
CreditsQty real
TransfersInQty real
TransfersOutQty real
CStockQty real
OStockAmt money
DeliveriesAmt money
CreditsAmt money
TransfersInAmt money
TransfersOutAmt money
CStockAmt money
... except that it has about 40 columns
We are considering normalising that table, so that we have a table for fields and another for data. Like this:
create table StockResults_Fields
(FieldID int, FieldName varchar(20), FieldDataType varchar(10))
create table StockResults_Values
(StockPeriodID int, StockItemID int, FieldID int, FieldName varchar(20), FieldDataType varchar(10))
The reason we are considering doing that is to improve the performance of the table and to prevent deadlocks (which we are currently getting). The advice on normalizing to reduce deadlocks comes from this article: Reducing SQL Server Deadlocks.
My concern is that the results table, which is already large, will get even bigger. Also, most of the reports display data in a structure similar to the current one, so the new layout will need quite a few more joins.
Before we commit to something that will involve quite a lot of work, does anyone have advice on this normalized structure and on whether it would actually deliver the performance benefits?
EDIT: Thanks for the advice. I had a gut feeling that the 2-table approach wasn’t the way to go, but until now I wasn’t sure why. The locking error has been solved: the cause was a table with no clustered index. Snapshot isolation also looks like something we might consider.
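For anyone hitting the same problem, the two things mentioned in this edit can be sketched roughly as follows. The table name `dbo.StockMovements`, the index name, the key columns and the database name `Inventory` are placeholders for illustration, not our real objects; only the statements themselves are standard T-SQL.

```sql
-- Give the heap a clustered index.
-- dbo.StockMovements and its key columns are hypothetical; use the table
-- that was missing a clustered index and a key that suits its access pattern.
CREATE CLUSTERED INDEX CIX_StockMovements_Period_Item
    ON dbo.StockMovements (StockPeriodID, StockItemID);

-- Option 1: let individual sessions opt in to snapshot isolation.
-- "Inventory" is an assumed database name.
ALTER DATABASE Inventory SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A session that wants versioned reads then asks for them explicitly.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

-- Option 2: make the default READ COMMITTED level use row versioning.
-- This statement needs the database to have no other active connections
-- while it runs.
ALTER DATABASE Inventory SET READ_COMMITTED_SNAPSHOT ON;
```

Both snapshot options keep row versions in tempdb, so it is worth watching tempdb usage after turning either of them on.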
It sounds like you know all the needed columns at the time of designing the system. If that’s the case, you should absolutely not proceed with the design you proposed.
The only possible reason for that kind of design is if you don’t know all the fields you will need at design time, and need to add some after you are in production.
I would predict that your two-table approach would perform much worse than your current approach.
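To make that prediction a bit more concrete, here is a rough sketch of what a report query might look like in each layout. The values table as posted has no column for the value itself, so the `FieldValue` column below is an assumption, as are the `FieldID` numbers 1, 2 and 3 and the example period 42.

```sql
-- Current wide table: one row already holds every figure for the item/period.
SELECT StockPeriodID, StockItemID, OStockQty, DeliveriesQty, CStockQty
FROM StockResults
WHERE StockPeriodID = 42;   -- 42 is an arbitrary example period

-- Field/value layout: the same three figures have to be pivoted back into
-- columns. FieldValue is an assumed text column, and FieldIDs 1, 2, 3 are
-- assumed to map to OStockQty, DeliveriesQty and CStockQty.
SELECT v.StockPeriodID,
       v.StockItemID,
       MAX(CASE WHEN v.FieldID = 1 THEN CAST(v.FieldValue AS real) END) AS OStockQty,
       MAX(CASE WHEN v.FieldID = 2 THEN CAST(v.FieldValue AS real) END) AS DeliveriesQty,
       MAX(CASE WHEN v.FieldID = 3 THEN CAST(v.FieldValue AS real) END) AS CStockQty
FROM StockResults_Values AS v
WHERE v.StockPeriodID = 42
  AND v.FieldID IN (1, 2, 3)
GROUP BY v.StockPeriodID, v.StockItemID;
```

With about 40 columns, each item/period row in the current table would become about 40 rows in the values table, each repeating the key columns. A report that shows all of them would need 40 such CASE expressions (or 40 joins), and the values lose their native `money` and `real` types along the way.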
Also, this has nothing to do with normalization, at least by my definition. What you would be doing is moving away from a relational model and toward a metadata model.
(Edit: you should also post more info about when and where the deadlocks are occurring, if that is the root of the problem you are trying to solve.)
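If you are not sure how to get that information, one common way on SQL Server 2005 and later is trace flag 1222, which writes the details of each deadlock (the sessions, statements and resources involved) to the SQL Server error log. A minimal sketch, assuming you have sysadmin rights on the instance:

```sql
-- Turn on deadlock reporting for all sessions (-1 = global).
-- A flag set this way lasts until the instance restarts.
DBCC TRACEON (1222, -1);

-- Confirm which trace flags are currently active globally.
DBCC TRACESTATUS (-1);

-- ... reproduce the deadlock, then read the SQL Server error log ...

-- Turn the flag off again once you have captured what you need.
DBCC TRACEOFF (1222, -1);
```

The logged output shows which statements and which tables or indexes were involved, which is usually enough to tell whether the schema is really the cause.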