I have some existing code that queries a SQL database repeatedly with different parameters. I suspect it would perform better if I selected one big chunk of data into an ADODB.Recordset at the start, and then queried that recordset inside the loop instead of the database itself.
One additional caveat is that I need to use aggregate functions (SUM, MIN, MAX, AVG) when performing these sub-queries.
Coding this wouldn’t be too terribly difficult, but something this obvious seems like it would have been done thousands of times before. Is there an open source library out there that provides this type of functionality? I swear I encountered one a few years back, but I am unable to track it down on Google.
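To make the idea concrete, here is a rough sketch of the load-once-then-aggregate-locally pattern. It is written in Python rather than VBA/ADO, and the rows, the column names (`region`, `ts`, `amount`) and the `aggregate` helper are all made up for illustration; in the real code the cached rows would come from the single up-front query.

```python
from datetime import datetime
from statistics import mean

# Hypothetical rows standing in for the one big chunk fetched up front.
ROWS = [
    {"region": "east", "ts": datetime(2020, 1, 1), "amount": 10.0},
    {"region": "east", "ts": datetime(2020, 1, 5), "amount": 30.0},
    {"region": "west", "ts": datetime(2020, 1, 3), "amount": 20.0},
    {"region": "east", "ts": datetime(2020, 2, 1), "amount": 50.0},
]


def aggregate(rows, region, start, end):
    """Return SUM/MIN/MAX/AVG of amount for one region with start <= ts < end.

    Returns None when no cached row matches the filter.
    """
    values = [
        r["amount"]
        for r in rows
        if r["region"] == region and start <= r["ts"] < end
    ]
    if not values:
        return None
    return {
        "sum": sum(values),
        "min": min(values),
        "max": max(values),
        "avg": mean(values),
    }


# One "sub-query" against the cache; the loop would vary these parameters.
result = aggregate(ROWS, "east", datetime(2020, 1, 1), datetime(2020, 2, 1))
print(result)
```

Each call scans the whole cache, so this only pays off when the cached chunk is small enough that a local scan is cheaper than a round trip to the database.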
EDIT:
A good suggestion (by TimW) in the comments was to do all the aggregation on the database server and pass back to the client, and then just do the filtering on the client.
(In this case it won’t work, though, because two of the columns being filtered on are DateTime columns.)
UPDATE
Here is the library I previously encountered:
http://code.google.com/p/ado-dataset-tools/
I’m not sure whether the author has abandoned it (his plan seemed to be to update it and convert it to C#), but the VBA versions of the various libraries seem to be available here:
http://code.google.com/p/ado-dataset-tools/source/browse/trunk/ado-recordset-unit-tests.xls?spec=svn8&r=8#ado-recordset-unit-tests.xls
The specific ADO library I was interested in is here:
http://code.google.com/p/ado-dataset-tools/source/browse/trunk/ado-recordset-unit-tests.xls/SharedRecordSet.bas
See specifically the GroupRecordSet() function.
Only the SUM, MIN and MAX aggregate functions seem to be supported.
Another possible alternative (if running within Excel):
Writing SQL Queries Against Virtual Tables in Excel VBA
http://www.vbaexpress.com/forum/showthread.php?t=260
I’m not sure how this would perform, but pulling the raw data (with partial pre-aggregation) into a local Excel worksheet, and then using that worksheet as a data source for the subsequent queries, might be a viable option.
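The same "local data source" idea can be sketched outside Excel. The example below is only an analogy, not the worksheet technique from the linked thread: it uses Python’s built-in `sqlite3` module with an in-memory table in place of the worksheet, and the table name `cache`, its columns and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database standing in for the local worksheet.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (region TEXT, ts TEXT, amount REAL)")

# Hypothetical rows; in practice these would be loaded once from the server.
# Timestamps are stored as ISO-8601 text so that string comparison matches
# chronological order.
conn.executemany(
    "INSERT INTO cache VALUES (?, ?, ?)",
    [
        ("east", "2020-01-01 00:00:00", 10.0),
        ("east", "2020-01-05 00:00:00", 30.0),
        ("west", "2020-01-03 00:00:00", 20.0),
        ("east", "2020-02-01 00:00:00", 50.0),
    ],
)


def query_cache(conn, region, start, end):
    """Run the aggregate sub-query locally; returns (sum, min, max, avg)."""
    return conn.execute(
        "SELECT SUM(amount), MIN(amount), MAX(amount), AVG(amount) "
        "FROM cache WHERE region = ? AND ts >= ? AND ts < ?",
        (region, start, end),
    ).fetchone()


stats = query_cache(conn, "east", "2020-01-01 00:00:00", "2020-02-01 00:00:00")
print(stats)
```

Unlike the recordset library above, a local SQL engine gives AVG and DateTime range filters for free, at the cost of copying the data into a second store.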
From my research into this subject, there is no easy solution, and I found no existing libraries or commercial products that fully cover it. As far as I can tell, the only viable option is to bite the bullet and hand-code a solution, which is more work than it’s worth to me.
So I am marking this as the correct answer despite it not being the solution to the problem. 🙂