I’m planning software that’s an OLAP application at its heart (it helps analyse metering data) and is going to have some kind of star schema for its database, because the stored values will be looked at from different angles (time, source, type etc.) and the requests will be asking for aggregated data along these dimensions. The queries tend to deliver a lot of rows (up to some 100 000).
My research on this topic (see also my question here) seems to indicate that bitmap indices are a good way to search for data the way I’m planning to. However, I want to support multiple db engines, some of which do not offer bitmap indices on their tables (in particular, MySQL).
Now, I can certainly build and maintain my own bitmap index and use it to look up the row ids of matching rows in the fact table. However, I suspect that this is going to defeat the whole purpose of the index, because the database still has to find those row ids in a B-Tree. Could somebody with a deeper theoretical background or more experience tell me whether I still gain anything, such as not having to do slow JOINs on the dimension tables?
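To make the idea concrete, here is a minimal sketch of the kind of self-maintained bitmap index I have in mind. It is only an illustration under assumptions: the `BitmapIndex` class, the `row_ids` helper and the sample fact rows are all made up for this example, and Python integers stand in for real (compressed) bitmaps.

```python
from collections import defaultdict


class BitmapIndex:
    """Hypothetical index: one bitmap per distinct value of a dimension column.

    Bit i of a bitmap is set when fact row i carries that value.
    """

    def __init__(self):
        self.bitmaps = defaultdict(int)

    def add(self, row_id, value):
        # Set the bit for this row in the bitmap of its value.
        self.bitmaps[value] |= 1 << row_id

    def lookup(self, value):
        # Unknown values map to an empty bitmap.
        return self.bitmaps.get(value, 0)


def row_ids(bitmap):
    """Expand a bitmap into the ascending list of row ids it contains."""
    ids = []
    pos = 0
    while bitmap:
        if bitmap & 1:
            ids.append(pos)
        bitmap >>= 1
        pos += 1
    return ids


# Made-up fact rows: (row_id, source, type)
facts = [
    (0, "meter_a", "power"),
    (1, "meter_b", "power"),
    (2, "meter_a", "gas"),
    (3, "meter_a", "power"),
]

by_source = BitmapIndex()
by_type = BitmapIndex()
for rid, source, kind in facts:
    by_source.add(rid, source)
    by_type.add(rid, kind)

# AND the two bitmaps to get the rows matching both predicates.
matching = row_ids(by_source.lookup("meter_a") & by_type.lookup("power"))
print(matching)
```

The resulting row ids would then go into something like a `WHERE id IN (...)` clause against the fact table, and that final lookup is the part that still goes through the engine's B-Tree.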
I would also appreciate hints on what I have to evaluate if the answer is not straightforward.
Some DB engines that do not directly support bitmap indexes still have star optimisations that can do this type of query without hitting the fact table. SQL Server, for instance, has a feature called Index Intersection that does something similar by constructing bitmaps on the fly to do the resolution. Microsoft claims that the performance of this is comparable to bitmap indexes. See this posting for a bit of fan-out on this topic.
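As a rough illustration of the general technique (not of how SQL Server actually implements it), the sketch below scans two ordinary single-column indexes, turns each result into a bitmap at query time, ANDs the bitmaps, and only then touches the matching rows. The sorted lists standing in for B-tree indexes, the helper names and the sample data are all assumptions made for this example.

```python
from bisect import bisect_left, bisect_right


def index_scan(index, key):
    """Return the row ids stored under `key` in a sorted (key, row_id) list.

    The sorted list is a stand-in for an ordinary B-tree index.
    """
    lo = bisect_left(index, (key,))
    hi = bisect_right(index, (key, float("inf")))
    return [rid for _, rid in index[lo:hi]]


def to_bitmap(row_ids):
    """Build a bitmap on the fly from the row ids an index scan returned."""
    bitmap = 0
    for rid in row_ids:
        bitmap |= 1 << rid
    return bitmap


# Made-up fact rows: (source, type, value)
rows = [
    ("meter_a", "power", 5),
    ("meter_b", "power", 7),
    ("meter_a", "gas", 1),
    ("meter_a", "power", 6),
]

# Two independent single-column indexes over the same rows.
source_index = sorted((r[0], i) for i, r in enumerate(rows))
type_index = sorted((r[1], i) for i, r in enumerate(rows))

# Scan each index separately, then intersect the temporary bitmaps.
combined = to_bitmap(index_scan(source_index, "meter_a")) & to_bitmap(
    index_scan(type_index, "power")
)

# Only rows whose bit survived the AND are read and aggregated.
total = sum(rows[i][2] for i in range(len(rows)) if (combined >> i) & 1)
print(total)
```

The point of the sketch is that the bitmaps are temporary: the engine keeps plain indexes on disk and pays the cost of building the bitmaps per query, rather than maintaining them on every insert.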
I’m not sure off the top of my head if MySQL does this, but PostgreSQL certainly does. IIRC some of the variants (Greenplum, I think) also directly support bitmap indexes, and there was some talk of incorporating them into the main DB engine. I don’t recall if this has been done yet.
I think you will find that most modern DBMS platforms offer star query optimisations of one sort or another, so you probably don’t need to re-invent the wheel. You may find one or two that cannot do this, but you always have the option of just not supporting them.