This has been bugging me for a while and I’m hoping that one of the SQL Server experts can shed some light on it.
The question is:
When you index a SQL Server column containing a UDT (CLR type), how does SQL Server determine what index operation to perform for a given query?
Specifically I am thinking of the hierarchyid (AKA SqlHierarchyID) type. The way Microsoft recommends that you use it – and the way I do use it – is:
-
Create an index on the
hierarchyidcolumn itself (let’s call itID). This enables a depth-first search, so that when you writeWHERE ID.IsDescendantOf(@ParentID) = 1, it can perform an index seek. -
Create a persisted computed
Levelcolumn and create an index on(Level, ID). This enables a breadth-first search, so that when you writeWHERE ID.GetAncestor(1) = @ParentID, it can perform an index seek (on the second index) for this expression.
But what I don’t understand is how is this possible? It seems to violate the normal query plan rules – the calls to GetAncestor and IsDescendantOf don’t appear to be sargable, so this should result in a full index scan, but it doesn’t. Not that I am complaining, obviously, but I am trying to understand if it’s possible to replicate this functionality on my own UDTs.
Is hierarchyid simply a “magical” type that SQL Server has a special awareness of, and automatically alters the execution plan if it finds a certain combination of query elements and indexes? Or does the SqlHierarchyID CLR type simply define special attributes/methods (similar to the way IsDeterministic works for persisted computed columns) that are understood by the SQL Server engine?
I can’t seem to find any information about this. All I’ve been able to locate is a paragraph stating that the IsByteOrdered property makes things like indexes and check constraints possible by guaranteeing one unique representation per instance; while this is somewhat interesting, it doesn’t explain how SQL Server is able to perform a seek with certain instance methods.
So the question again – how do the index operations work for types like hierarchyid, and is it possible to get the same behaviour in a new UDT?
The query optimizer team is trying to handle scenarios that don’t change the order of things. For example,
cast(someDateTime as date)is still sargable. I’m hoping that as time continues, they fix up a bunch of old ones, such as dateadd/datediff with a constant.So… handling Ancestor is effectively like using the LIKE operator with the start of a string. It doesn’t change the order, and you can still get away with stuff.