I’m creating a simple business analytics application for the company I work at and have a 100mb csv file of sales transactions from the last 6 years that I’ve parsed into a number of databases.
One of the figures I want to display in my application are turnover and profit per customer per day, so I have a while loop that looks like the following:
while(start_date < current_date)
{
SELECT SUM(profit), SUM(turnover) FROM sales WHERE date = @date
}
So, I’m running that query on a table with a few hundred thousand rows as many times per day as there are customers every day for 6 years.
I was always led to believe any work that can be carried out in the DB, should be – which is why I’ve used the SUM queries to total the columns.
What other optimizations can I make? At the moment it is taking… well, I couldn’t tell you since it’s been running for the last hour on an 8GB, quad core server and looks like its at around 2-3%.
I’m using SQL Server Compact Edition and it’s a winforms application in C#/.NET but the db is obviously the bottleneck here.
try introducing summary table where you’ll have daily stats. update that summary table [with triggers or application logic] whenever you add some more data to the sales table.