I have several tables that must be joined based on start and stop dates, for instance …
SELECT *
FROM [t1]
JOIN [t2]
ON [t2].[start] BETWEEN [t1].[start] AND [t1].[stop]
OR [t2].[stop] BETWEEN [t1].[start] AND [t1].[stop]
OR [t2].[start] < [t1].[start] and [t2].[stop] > [t1].[stop]
These tables could be in the multiple-millions of rows.
I have the option to store the start/stop as datetime2 and/or bigint(nanoseconds since epoch). Given this option – would there be a performance gain by using the bigint if indexed?
Any pros/cons with either approach?
I would expect the integer approach to be faster –
Use
datetime2. In addition to “easier to read by human”, easier to program by a human. If you don’t use a date type to represent a date, you’ll have problems computing intervals, joining to other tables, comparing to external times e.g.getdate().As far as efficiency goes, for purposes of comparison the server almost certainly treats all date types as integers. The only speed advantage that might be open to you, depending on whether you usually care about the seconds, would be clustering on
smalldatetime, because it’s 4 bytes. You could still keep the high-resolution part as a separatereal.