Using SQL Server 2005 and 2008.
I’ve got a potentially very large table (potentially hundreds of millions of rows) consisting of the following columns:
CREATE TABLE (
date SMALLDATETIME,
id BIGINT,
value FLOAT
)
which is being partitioned on column date in daily partitions. The question then is should the primary key be on date, id or value, id?
I can imagine that SQL Server is smart enough to know that it’s already partitioning on date and therefore, if I’m always querying for whole chunks of days, then I can have it second in the primary key. Or I can imagine that SQL Server will need that column to be first in the primary key to get the benefit of partitioning.
Can anyone lend some insight into which way the table should be keyed?
As is the standard practice, the Primary Key should be the candidate key that uniquely identifies a given row.
What you wish to do, is known as Aligned Partitioning, which will ensure that the primary key is also split by your partitioning key and stored with the appropriate table data. This is the default behaviour in SQL Server.
For full details, consult the reference Partitioned Tables and Indexes in SQL Server 2005