In a data warehouse, are there disadvantages to creating clustered indexes on fact tables? (Most of the time, the index would be on the datetime column.)
Would your answer be yes or no by default?
If I shouldn’t create clustered indexes by default, then why? (I know the pros of clustered indexes, but what are some cons?)
References
http://blogs.sqlserver.org.au/blogs/greg_linwood/archive/2006/09/11/365.aspx
I would always suggest having a clustered index on any table (transactional or warehouse) that is frequently searched by a given value. The downside of any index is extra storage plus extra work on every insert, update, and delete. Note that in SQL Server a clustered index is not a separate copy of the data: its leaf level is the table itself, so most of the additional storage comes from nonclustered indexes, each of which also carries the clustering key. If the table being indexed is huge, those indexes will be too, and the more of them you have, the more data you store on top of the base table. However, if you need fast searches, an index may be the only way to get that speed.
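As a rough illustration of a table clustered on a date key, here is a minimal sketch in Python. It swaps in SQLite for SQL Server so that it runs anywhere: SQLite's `WITHOUT ROWID` tables store rows inside the primary-key B-tree, which is the closest equivalent to a clustered index. The table `fact_sales`, its columns, and the helper `sales_between` are hypothetical names made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical fact table. WITHOUT ROWID makes SQLite keep the rows inside
# the primary-key B-tree, i.e. the table is clustered on (sale_date, sale_id).
conn.execute(
    """
    CREATE TABLE fact_sales (
        sale_date TEXT    NOT NULL,  -- ISO-8601 text, sorts chronologically
        sale_id   INTEGER NOT NULL,  -- tie-breaker so the key is unique
        amount    REAL    NOT NULL,
        PRIMARY KEY (sale_date, sale_id)
    ) WITHOUT ROWID
    """
)

# Rows arrive out of date order; the B-tree stores them in key order anyway.
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [
        ("2024-01-03", 1, 10.0),
        ("2024-01-01", 2, 20.0),
        ("2024-02-15", 3, 30.0),
        ("2024-01-20", 4, 40.0),
    ],
)

RANGE_QUERY = (
    "SELECT sale_date, sale_id, amount FROM fact_sales "
    "WHERE sale_date >= ? AND sale_date < ? "
    "ORDER BY sale_date, sale_id"
)


def sales_between(start, end):
    """Return rows with start <= sale_date < end, in key order."""
    return conn.execute(RANGE_QUERY, (start, end)).fetchall()


january = sales_between("2024-01-01", "2024-02-01")
print(january)

# Print the query plan; it should report a search on the primary key
# rather than a scan of the whole table.
for step in conn.execute(
    "EXPLAIN QUERY PLAN " + RANGE_QUERY, ("2024-01-01", "2024-02-01")
):
    print(step[-1])
```

The point of the sketch is only that a range predicate on the clustering key can be answered from one contiguous part of the table. The exact behaviour and costs in SQL Server will differ.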
Alternatively, you could create the clustered index on the table's ID column and build your search indexes outside the database in a product such as Lucene (or Lucene.NET). You search the Lucene index, which offers far more flexibility and features for searching, and it returns the ID of the matching record or records; you then use those IDs to fetch the data you need from the database. We have used this approach quite a bit on my current project, and I must admit it works very well. Building the indexes is also considerably faster, especially compared with the full-text search options in SQL Server. Just something to consider.
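The two-step lookup described above (search an external index for IDs, then fetch rows by ID) might look roughly like this in Python. This is only a sketch under stated assumptions: a plain in-memory dictionary stands in for Lucene, SQLite stands in for the real database, and the `documents` table and the helpers `search_ids` and `fetch_rows` are hypothetical names, not part of any library.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")

# Hypothetical table keyed on its ID. In SQLite an INTEGER PRIMARY KEY is the
# key of the table's own B-tree, so lookups by id go straight to the row.
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
docs = [
    (1, "clustered index on the fact table"),
    (2, "full text search with an external index"),
    (3, "fact table partitioning by date"),
]
conn.executemany("INSERT INTO documents VALUES (?, ?)", docs)

# Stand-in for the external search index (assumption: a real system would
# use Lucene here). It maps each term to the set of IDs containing it.
external_index = defaultdict(set)
for doc_id, body in docs:
    for term in body.lower().split():
        external_index[term].add(doc_id)


def search_ids(query):
    """Step 1: ask the external index for IDs matching every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(external_index.get(t, set()) for t in terms))
    return sorted(matches)


def fetch_rows(ids):
    """Step 2: fetch the full rows from the database by primary key."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        f"SELECT id, body FROM documents WHERE id IN ({placeholders}) ORDER BY id",
        ids,
    )
    return cur.fetchall()


hits = fetch_rows(search_ids("fact table"))
print(hits)
```

With this split the database only ever answers lookups by ID, which is exactly the access path an index on the ID column serves, while all text matching happens in the external index.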