I am wondering if there is a good-performing query to select distinct dates (ignoring times) from a table with a datetime field in SQL Server.
My problem isn't getting the server to actually do this (I've seen this question already, and we already had something similar in place using DISTINCT). The problem is whether there is any trick to get it done more quickly. With our data, the current query returns ~80 distinct days from ~40,000 rows (after filtering on another indexed column). There is an index on the date column, yet the query always takes 5+ seconds, which is too slow.
Changing the database structure might be an option, but a less desirable one.
Every option that involves CAST, truncation, or DATEPART manipulation on the datetime field has the same problem: the query has to scan the entire result set (the 40k rows) in order to find the distinct dates. Performance may vary marginally between the various implementations.
What you really need is an index that can produce the response in a blink. You can either add a persisted computed column and index it (requires table structure changes) or create an indexed view (requires Enterprise Edition for the query optimizer to consider the index out of the box).
Persisted computed column:
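A minimal sketch, assuming the table is dbo.foo and the datetime column is called datetimecolumn (a placeholder name; substitute the real one). CONVERT with style 112 produces a yyyymmdd string and is deterministic, which is what allows the column to be persisted and indexed:

```sql
-- Assumed names: dbo.foo, datetimecolumn. Style 112 = 'yyyymmdd'.
ALTER TABLE dbo.foo
    ADD date_only AS CONVERT(char(8), datetimecolumn, 112) PERSISTED;
GO  -- batch separator understood by SSMS/sqlcmd, not a T-SQL statement

-- Index the computed column so the date values are read from a narrow index
-- instead of being recomputed for every row.
CREATE INDEX idx_foo_date_only ON dbo.foo (date_only);
GO

SELECT DISTINCT date_only FROM dbo.foo;
```

This still reads one index entry per row, but the entries are small and already sorted by date.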
Indexed view:
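A minimal sketch, assuming dbo.foo has a unique key column named id and a datetime column named datetimecolumn (both names are placeholders). An indexed view must be schema-bound, must reference the table by its two-part name, and needs a unique clustered index:

```sql
-- Assumed names: dbo.foo, id (unique key), datetimecolumn.
CREATE VIEW dbo.v_foo_with_date_only
WITH SCHEMABINDING
AS
SELECT id,
       CONVERT(char(8), datetimecolumn, 112) AS date_only
FROM dbo.foo;
GO

-- The unique clustered index is what materializes the view.
CREATE UNIQUE CLUSTERED INDEX idx_v_foo
    ON dbo.v_foo_with_date_only (date_only, id);
GO

-- On editions where the optimizer does not match indexed views automatically,
-- query the view directly and add the NOEXPAND hint.
SELECT DISTINCT date_only
FROM dbo.v_foo_with_date_only WITH (NOEXPAND);
```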
Update
To completely eliminate the scan, one could use a GROUP BY trick in an indexed view, like this:
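A minimal sketch of such a view, again assuming the placeholder names dbo.foo and datetimecolumn. The view name and the row_count alias are made up for illustration. SQL Server requires COUNT_BIG(*) in the select list of any indexed view that uses GROUP BY:

```sql
-- Assumed names: dbo.foo, datetimecolumn; view and alias names are illustrative.
CREATE VIEW dbo.v_foo_distinct_dates
WITH SCHEMABINDING
AS
SELECT CONVERT(char(8), datetimecolumn, 112) AS date_only,
       COUNT_BIG(*) AS row_count   -- mandatory for an indexed view with GROUP BY
FROM dbo.foo
GROUP BY CONVERT(char(8), datetimecolumn, 112);
GO

-- One index row per distinct date, maintained by the engine on every write.
CREATE UNIQUE CLUSTERED INDEX idx_v_foo_distinct_dates
    ON dbo.v_foo_distinct_dates (date_only);
GO
```

The materialized view then holds one row per day (about 80 for the data described in the question), at the cost of extra work on every insert, update, and delete against the base table.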
The query

select distinct date_only from foo

will use this indexed view instead. It is still technically a scan, but on an already "distinct" index, so only the needed records are scanned. It's a hack, I reckon, and I would not recommend it for live production code.

AFAIK SQL Server does not have the capability of scanning a true index while skipping repeats, i.e. seek the top value, then seek the first value greater than it, then successively seek the first value greater than the last one found.
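That seek pattern can be emulated by hand with a loop, at the cost of one index seek per distinct day. The following is only a sketch under the same assumed names (dbo.foo, datetimecolumn, with an index on datetimecolumn), not something taken from the original setup:

```sql
-- Assumed names: dbo.foo, datetimecolumn (indexed). Placeholders only.
DECLARE @dates TABLE (date_only datetime PRIMARY KEY);
DECLARE @current datetime;
DECLARE @next_midnight datetime;

-- Seek the smallest value in the index.
SELECT @current = MIN(datetimecolumn) FROM dbo.foo;

WHILE @current IS NOT NULL
BEGIN
    -- Record midnight of the day just found.
    INSERT INTO @dates (date_only)
    VALUES (DATEADD(day, DATEDIFF(day, 0, @current), 0));

    -- Midnight of the following day is the lower bound for the next seek.
    SET @next_midnight = DATEADD(day, DATEDIFF(day, 0, @current) + 1, 0);

    -- MIN returns NULL when no later row exists, which ends the loop.
    SELECT @current = MIN(datetimecolumn)
    FROM dbo.foo
    WHERE datetimecolumn >= @next_midnight;
END;

SELECT date_only FROM @dates ORDER BY date_only;
```

If the query also filters on another column, as in the question, that predicate would have to be added to both MIN queries, and the seeks would presumably need a composite index leading with that column.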