Are there any specialized databases (RDBMS, NoSQL, key-value, or anything else) that are optimised for running fast aggregate queries or map-reduce jobs like this over very large data sets:
select date, count(*)
from Sales
where [various combinations of filters]
group by date
So far I’ve run benchmarks on MongoDB and SQL Server, but I’m wondering if there’s a more specialized solution, preferably one that can scale data horizontally.
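To make the workload concrete, here is a minimal Python sketch of the same query written as a map-reduce over horizontally partitioned data. It is only an illustration of the shape of the computation, not how any particular database implements it. The sample rows, the `region` and `amount` columns, and the function names are all hypothetical; only `date` and the `Sales` table come from the query above.

```python
from collections import Counter
from datetime import date
from functools import reduce

# Hypothetical sample rows standing in for the Sales table.
# Only the "date" column comes from the query; the rest is made up.
SALES = [
    {"date": date(2024, 1, 1), "region": "EU", "amount": 10.0},
    {"date": date(2024, 1, 1), "region": "US", "amount": 25.0},
    {"date": date(2024, 1, 1), "region": "EU", "amount": 5.0},
    {"date": date(2024, 1, 2), "region": "EU", "amount": 7.5},
]


def map_partition(rows, predicate):
    """Map step: count the rows that pass the filter, per date, in one partition."""
    return Counter(row["date"] for row in rows if predicate(row))


def reduce_counts(left, right):
    """Reduce step: merge two partial per-date counts by adding them."""
    return left + right


def count_by_date(partitions, predicate):
    """Equivalent of: select date, count(*) from Sales where <predicate> group by date."""
    partials = (map_partition(rows, predicate) for rows in partitions)
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    # Pretend the table is split across two nodes.
    partitions = [SALES[:2], SALES[2:]]
    result = count_by_date(partitions, lambda row: row["region"] == "EU")
    for day, count in sorted(result.items()):
        print(day, count)
```

Each partition can be counted independently and the partial counts merged afterwards, which is why this kind of query is a natural fit for a store that scales horizontally.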
For certain kinds of data (large volumes, time series), kdb+ from kx.com is probably the best solution. If that looks like your kind of data, give it a try. Note: it isn't queried with standard SQL, but with q, a more general, more powerful, and considerably more unusual array-oriented language.