We have a giant SQL Server 2005 database (75GB) which basically is just data in one table with sales values (per day, store and article). We want to make place by adding up the sales value per week for every record older than a year (still grouped per store and article). So in theory for data older than a year we can then remove 6 out 7 records.
Writing a procedure to do this is not really a problem but it runs like forever. So I was looking for a strategy that could make this run in a reasonable amount of time.
To give you an idea: running SELECT count(*) ran for over 4 minutes
We do have a few indexes (on date (clustered) and on the store, article and date combination). Adding any more indexes also takes like forever.
Anyone has got a good strategy on how to perform this task? Any suggestions on TSQL methods which perform better than the basic DML statements?
If You use SQL Server 2005 Enterprise Edition, You should consider using partitioning feature. Benefits:
If You don’t use Enterprise Edition, use this link to see partitioning (sharding or horizontal partitioning) capabilities not based on SQL Server 2005 partitioning feature.
For stored procedure optimization:
Off-topic hint: if using Enterprise Edition, consider compressing your table since SQL Server 2005 is usually good at compression of fact tables – you will probably gain both performance and disk space if You have enough CPU power.