In SQL Server (2005+) I need to index a column (exact matches only) that is nvarchar(2000+). What is the most scalable, performant way to approach this?
In SQL Server (2005+), what would be the practical difference in indexing on a column with the following types:
nvarchar(2000), char(40), binary(16)
E.g. would a lookup against an indexed binary(16) column be measurably faster than a lookup against an indexed nvarchar(2000)? If so, how much?
Obviously smaller is always better in some regard, but I am not familiar enough with how SQL Server optimizes its indexes to know how it deals with length.
You’re thinking about this from the wrong direction:
Whether a column is binary(16) or nvarchar(2000) makes little difference there, because you don't just go adding indexes willy-nilly.

Don't let index choice dictate your column types. If you need to index an nvarchar(2000) column, consider a full-text index, or add a hash value for the column and index that.

Based on your update, I would probably create either a checksum column or a computed column using the HashBytes() function, and index that. Note that a checksum isn't the same as a cryptographic hash, so you are somewhat more likely to have collisions. That is fine as long as the query also compares the entire contents of the text: the index filters the candidate rows first, and the full comparison removes any false matches. HashBytes() is less likely to have collisions, but they are still possible, so you still need to compare the actual column. HashBytes() is also more expensive to compute, both for each query and for each change.
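
As a rough sketch of the hash-column approach, something like the following should work on SQL Server 2005 and later. The table, column, and index names (dbo.Documents, Body, BodyHash, IX_Documents_BodyHash) are made up for illustration, and the choice of MD5 is an assumption picked only because it yields 16 bytes, matching the binary(16) in the question. On 2005, HashBytes() input is limited to 8000 bytes, which an nvarchar(2000) value (at most 4000 bytes) fits within.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE dbo.Documents (
    Id       int IDENTITY(1,1) PRIMARY KEY,
    Body     nvarchar(2000) NOT NULL,
    -- MD5 produces 16 bytes. PERSISTED stores the hash on insert/update
    -- so it is not recomputed for every row at query time.
    BodyHash AS CAST(HASHBYTES('MD5', Body) AS binary(16)) PERSISTED
);

-- Index the small fixed-width hash instead of the wide text column.
CREATE INDEX IX_Documents_BodyHash ON dbo.Documents (BodyHash);

-- Exact-match lookup: seek on the hash, then compare the actual column
-- to rule out hash collisions.
DECLARE @search nvarchar(2000);
SET @search = N'text to find';

SELECT Id, Body
FROM dbo.Documents
WHERE BodyHash = CAST(HASHBYTES('MD5', @search) AS binary(16))
  AND Body = @search;
```

The checksum variant would presumably look the same with CHECKSUM(Body) (which returns an int) in place of the HashBytes() expression. One caveat with the hash version: HashBytes() works on the raw bytes, so under a case-insensitive collation two strings that compare equal with = can still produce different hashes, and the lookup above would then behave as a byte-exact match.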