Consider the following SQL Server 2005/2008 query:
Select [UID], [DESC] From SomeTable Order By [Desc];
If Desc is a fairly long field (Varchar(125) with many entries > 70 chars) and you don’t need a strict sorting, would it be more efficient to do this:
Select [UID], [DESC] From SomeTable Order By Substring([Desc], 0, 20);
The advantage is that all comparisons are pretty short (20 characters, max). The disadvantage is that it incurs the Substring call. For present purposes, assume that you don’t want to put an index on this field as this is not a primary key and the above is a fairly rare operation. Which option would you choose?
Note 2: I’m asking mostly out of curiosity here. In my application, Desc is an indexed field and I am not using Substring. However, I briefly considered using Substring and it occurred to me that I didn’t truly know which of the above approaches would be more efficient.
Finally, a bonus question: is it true that using Substring on an indexed field would make the optimizer skip the index and really slow things down? I don’t think the optimizer is smart enough to use the index if Substring is used (even with a zero base), but I am a bit too busy to test it out right now. However, if you know differently, please correct me!
Update/clarification: you should be assuming that the Desc field is not indexed for purposes of the original question. If it is indexed, the answer is pretty easy.
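One way to settle the question empirically is to run both forms with SQL Server's timing and I/O statistics switched on and compare the reported figures. The following is only a sketch against the SomeTable from the question; the numbers will depend on row count, collation, and hardware. Note that T-SQL's SUBSTRING is 1-based, so a start of 0 with a length of 20 yields at most 19 characters; the sketch uses a start of 1, which is an assumption about the intent.

```sql
-- Report CPU/elapsed time and page reads for each statement in the messages output.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Option 1: sort on the full column.
SELECT [UID], [DESC] FROM SomeTable ORDER BY [Desc];

-- Option 2: sort on a 20-character prefix (start position 1, not 0).
SELECT [UID], [DESC] FROM SomeTable ORDER BY SUBSTRING([Desc], 1, 20);

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the actual execution plans for the two statements should also show whether the sort operator's cost differs in any meaningful way.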
Use of a non-clustered index implies an implicit JOIN. The index itself does not contain the non-indexed values; it contains only references to the TABLE’s blocks.

To get the non-indexed values, you need to scan over the index and read from these blocks in a nested loop.

As a rule of thumb, INDEX SCAN WITH TABLE LOOKUP is about 10 times as costly as the TABLE SCAN.

If you need all the results of an ordered query, especially as a part of a more complex query implying the nested loops, it’s sometimes more efficient to perform a TABLE SCAN and sort the results.

The table needs to be sorted only once, and the results of the sort will be kept and reused. In this case, SUBSTRING may be more efficient.

If you need 5% of ordered results or less, then the INDEX SCAN will be more efficient; in this case you need to sort on the whole column.

Also, index lookup is always more responsive, as you get the first rows faster.
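The two cases above can be sketched as follows, assuming a non-clustered index on [Desc] exists. The index name IX_SomeTable_Desc and the row count of 100 are hypothetical, chosen only for illustration; which plan the optimizer actually picks is its own decision and is worth confirming in the execution plan.

```sql
-- Hypothetical index, for illustration only.
CREATE NONCLUSTERED INDEX IX_SomeTable_Desc ON SomeTable ([Desc]);

-- Small slice of the ordered result: the index can be read in order
-- and the scan can stop once enough rows have been returned.
SELECT TOP (100) [UID], [DESC] FROM SomeTable ORDER BY [Desc];

-- Whole ordered result: a table scan followed by a single sort
-- may be cheaper than an index scan with a lookup per row.
SELECT [UID], [DESC] FROM SomeTable ORDER BY [Desc];
```

Whether a per-row lookup is needed at all depends on which columns the index carries, so the comparison is best made against the real table definition.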