I have a seemingly simple requirement, but I can’t figure out how to write it as a query that needs only one round trip to the server.
Basically I have a simple table:
CREATE TABLE Item
(
    id int NOT NULL IDENTITY(1,1),
    [create] datetime NOT NULL, -- bracketed because CREATE is a reserved word in T-SQL
    [close] datetime NULL       -- NULL means not closed yet (CLOSE is reserved too)
);
What I want to do is, over a range of time (say 1/1/2010 to 6/1/2010), get for each month the number of items that were active in that month. An item is active in a month if it was created during or before that month and is either not closed (i.e. close is null) or was closed after that month. So I translated that into a LINQ expression using a helper method:
// Returns the first day of every month from Min's month up to Max's month.
// Max's month is included unless Max falls on the 1st, in which case Max acts as an exclusive end.
private IEnumerable<DateTime> EnumerateMonths(DateTime Min, DateTime Max)
{
    var curr = new DateTime(Min.Year, Min.Month, 1);
    var Stop = new DateTime(Max.Year, Max.Month, 1).AddMonths(Max.Day == 1 ? 0 : 1);
    while (curr < Stop)
    {
        yield return curr;
        curr = curr.AddMonths(1);
    }
}
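public List<DataPoint> GetBacklogByMonth(DateTime min, DateTime max)
{
    return EnumerateMonths(min, max)
        .Select(m => new DataPoint
        {
            Date = m,
            // Active in month m: created before the next month starts (strictly less, so an item
            // created at midnight on the 1st of next month is not counted), and still open or
            // closed on/after the start of the next month. Runs one query per month.
            Count = DB.Items.Where(s => s.Create < m.AddMonths(1) && (!s.Close.HasValue || s.Close.Value >= m.AddMonths(1)))
                .Count()
        }
        ).ToList();
}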
This returns the right numbers, but each Count is a separate query (one round trip per month), so it is very slow. My question is: how can I restructure this query so it runs in one round trip to the server?
Initially I thought about doing some sort of GROUP BY to aggregate by month, but because each item can be ‘active’ in many different months, I don’t think that would work.
Any suggestions?
I hate to answer my own question, but here is what I did.
What I really needed all along was a left join with a table of months, followed by a group and a count of the items for each month. A normal grouping on month wouldn’t work, because then each item would be counted in a single month only, not in every month it was active. So I added a table, Months, containing just the first day of each month, and did a left join on it. This operation runs often enough that I figured it was worth adding a table for it.
Here’s the final query:
I also added some code to make sure that Months has the correct rows in it for my query.
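The query itself isn’t reproduced above, so what follows is only a sketch of the idea described (left join the months to the items active in each month, then group and count), not the query that was actually used. It runs over in-memory collections with LINQ to Objects so it is self-contained; the record shapes for Item and DataPoint, the helper name MonthsBetween, and the sample data are all assumptions made for illustration, and it assumes C# 9 or later for records. Whether a given LINQ provider translates the same shape into a single SQL statement would need to be checked against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for rows of the Item table and for the result type.
public record Item(int Id, DateTime Create, DateTime? Close);
public record DataPoint(DateTime Date, int Count);

public static class Backlog
{
    // Hypothetical helper: first day of every month from min's month through max's month.
    // Plays the role of the Months table in this sketch.
    public static IEnumerable<DateTime> MonthsBetween(DateTime min, DateTime max)
    {
        var stop = new DateTime(max.Year, max.Month, 1);
        for (var m = new DateTime(min.Year, min.Month, 1); m <= stop; m = m.AddMonths(1))
            yield return m;
    }

    // Left join each month to the items active in it, then count per month.
    public static List<DataPoint> GetBacklogByMonth(
        IEnumerable<DateTime> months, IEnumerable<Item> items)
    {
        var query =
            from m in months
            let next = m.AddMonths(1)
            // Non-equi left join: DefaultIfEmpty yields one null row for months
            // with no active items, so those months still appear in the result.
            from i in items
                .Where(it => it.Create < next && (it.Close == null || it.Close >= next))
                .DefaultIfEmpty()
            group i by m into g
            orderby g.Key
            // Count only real items, not the null placeholder from the left join.
            select new DataPoint(g.Key, g.Count(x => x is not null));

        return query.ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item(1, new DateTime(2009, 12, 15), new DateTime(2010, 2, 10)),
            new Item(2, new DateTime(2010, 2, 20), null), // still open
            new Item(3, new DateTime(2010, 1, 5), new DateTime(2010, 4, 1)),
        };

        var months = MonthsBetween(new DateTime(2009, 11, 10), new DateTime(2010, 2, 1));

        foreach (var p in GetBacklogByMonth(months, items))
            Console.WriteLine($"{p.Date.Year}-{p.Date.Month:00}: {p.Count}");
    }
}
```

With this sample data the counts for November 2009 through February 2010 come out as 0, 1, 2 and 2. An item is counted once for every month it was active, and the empty month is still present because of the left join, which is what a plain grouping on the creation month cannot give you.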