I’m creating an SQL Server 2008 database that may contain millions of records and I was wondering if I need to define the following as indexes:
- TINYINT column that may contain only 0 and 1?
- TINYINT column that may contain only: 0, 5, and 6?
PS. Both of these columns will be used in the WHERE clause for selection.
No, an index on these columns alone basically will never be used.
But such low selectivity keys make great candidates for composite keys, placed as the leftmost column in the index. E.g. say the TINYINT (0,1) column (why not use `bit`, btw?) is the `deleted` column. You have frequent queries that predicate with `WHERE deleted=0 AND ...`. Adding this as the leftmost column in the clustered index is very often the proper approach. Or if the predicate is, say, `WHERE name = '...' AND deleted=0`, you should make a non-clustered index on `(deleted, name)`.

Another option is to use a filtered index: `create index .. on (name) where (deleted=0)`, but this does not cover the case where you are interested in the `deleted=1` rows.

Same goes for a column with few distinct values like, say, a `type` column. Again, making it the leftmost key in composite indexes usually makes a lot of sense.

Note though that if you add a low selectivity key as the leftmost key in an index and you do not specify this column in the predicate (e.g. `WHERE name='...'` without adding any criteria for `deleted`), then the index cannot be used to seek; only an index on `(name)` (or on `(name, ...)`) could be used, i.e. one where `name` is the leftmost key.

Why not make it the rightmost key, e.g. an index on `(name, deleted)`? Because there’s usually no benefit, unless you want to enforce a unique constraint. With only 0 or 1 to choose from, an index on `(name)` or an index on `(name, deleted)` basically offer the same performance (if they can be used). Placing the low selectivity key to the left enables some range scan scenarios (e.g. `WHERE type=5`).
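To make the options above concrete, here is a rough T-SQL sketch. The table `dbo.Items`, its `id` column, and all index names are hypothetical and only for illustration; the `name`, `type`, and `deleted` columns follow the examples in the answer. Which index the optimizer actually picks depends on your data and statistics, so check the execution plans on your own workload.

```sql
-- Hypothetical table; only name, type and deleted come from the discussion above.
CREATE TABLE dbo.Items (
    id      INT IDENTITY(1,1) NOT NULL,
    name    NVARCHAR(100)     NOT NULL,
    type    TINYINT           NOT NULL,           -- holds only 0, 5 or 6
    deleted BIT               NOT NULL DEFAULT 0  -- bit instead of TINYINT (0,1)
);

-- Low selectivity key as the leftmost column of the clustered index.
CREATE UNIQUE CLUSTERED INDEX CIX_Items_deleted_id
    ON dbo.Items (deleted, id);

-- Composite non-clustered index for WHERE name = '...' AND deleted = 0.
CREATE NONCLUSTERED INDEX IX_Items_deleted_name
    ON dbo.Items (deleted, name);

-- Alternative: filtered index (available from SQL Server 2008).
-- It only contains rows with deleted = 0, so it cannot help deleted = 1 queries.
CREATE NONCLUSTERED INDEX IX_Items_name_live
    ON dbo.Items (name)
    WHERE deleted = 0;

-- Same idea for a column with few distinct values.
CREATE NONCLUSTERED INDEX IX_Items_type_name
    ON dbo.Items (type, name);

-- Can seek on (deleted, name) or on the filtered index.
SELECT id, name FROM dbo.Items WHERE deleted = 0 AND name = N'widget';

-- Range scan on the leading low selectivity key of (type, name).
SELECT id, name FROM dbo.Items WHERE type = 5;

-- No predicate on deleted: none of the indexes above has name as its
-- leftmost key over all rows, so no seek on name is possible here.
SELECT id, name FROM dbo.Items WHERE name = N'widget';
```

If the last query is common in your workload, that is the case where a separate index with `name` as the leftmost key would be needed.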