In my database I have a computed column that contains a SHA1 hash of a column called URLString which holds URLs (e.g. “http://xxxx.com/index.html”).
I often need to query the table to find a specific URL based on the URLString column.
The table contains hundreds of thousands of rows, and these queries take several seconds (using SQL Azure).
Since URLs can be quite long (above 450 bytes), I cannot create an index on this column.
To speed things up, I want to calculate the equivalent of SQL Server's HASHBYTES('SHA1', [URLString]) in C# and query based on this value.
I tried the code below, but the value I get is different from the one calculated by the database.
var urlString = Encoding.ASCII.GetBytes(url.URLString); //UTF8 also fails
var sha1 = new SHA1CryptoServiceProvider();
byte[] hash = sha1.ComputeHash(urlString);
Am I missing something trivial here?
I’m open to other ideas that can solve the same problem (as long as they are supported by SQL Azure).
Example: in the database the automatically calculated SHA1 value of URL http://www.whatismyip.org/ is 0xAE66CA69A157186A511ED462153D7CA65F0C1BF7.
You’re likely getting bitten by character encoding differences:
http://weblogs.sqlteam.com/mladenp/archive/2009/04/28/Comparing-SQL-Server-HASHBYTES-function-and-.Net-hashing.aspx
You could try getting the bytes via Encoding.ASCII.GetBytes(url) or Encoding.Unicode.GetBytes(url) and see which one your db is using.
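To make that comparison concrete, here is a minimal sketch that hashes the same URL with both encodings so the results can be checked against the value stored in the database. HASHBYTES hashes the raw bytes of its input, so an nvarchar column is hashed as UTF-16LE (Encoding.Unicode in .NET), while a varchar column is hashed as single-byte characters in the column's code page (which matches Encoding.ASCII for plain ASCII URLs). The class and method names (UrlHash, Sha1Hex) are made up for illustration, and which encoding matches your computed column is an assumption you would need to verify against your own schema.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UrlHash
{
    // Hypothetical helper: returns the SHA1 of the URL's bytes in the given
    // encoding, formatted like SQL Server prints varbinary values (0x...).
    // Assumption: Encoding.Unicode corresponds to an nvarchar column,
    // Encoding.ASCII to a varchar column holding only ASCII characters.
    public static string Sha1Hex(string url, Encoding encoding)
    {
        using (var sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(encoding.GetBytes(url));
            return "0x" + BitConverter.ToString(hash).Replace("-", "");
        }
    }

    public static void Main()
    {
        const string url = "http://www.whatismyip.org/";

        // Compare both outputs with the value in the computed column.
        Console.WriteLine("ASCII:   " + Sha1Hex(url, Encoding.ASCII));
        Console.WriteLine("Unicode: " + Sha1Hex(url, Encoding.Unicode));
    }
}
```

Whichever line matches the database value tells you the encoding to use. When querying, it is probably simplest to pass the raw 20-byte hash as a varbinary parameter rather than the hex string.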