I’m writing a password salt/hash procedure for my .NET application, largely following this article:
http://www.aspheute.com/english/20040105.asp
Basically the code for computing the salted hash is this:
    public string ComputeSaltedHash(string password, byte[] salt) {
        // Get password ASCII text as bytes:
        byte[] passwordBytes = System.Text.Encoding.ASCII.GetBytes(password);
        // Append the two arrays
        byte[] toHash = new byte[passwordBytes.Length + salt.Length];
        Array.Copy(passwordBytes, 0, toHash, 0, passwordBytes.Length);
        Array.Copy(salt, 0, toHash, passwordBytes.Length, salt.Length);
        byte[] computedHash = SHA1.Create().ComputeHash(toHash);
        // Return as an ASCII string
        return System.Text.Encoding.ASCII.GetString(computedHash);
    }
However, I want to allow users to use Unicode chars in their password, if they like. (This seems like a good idea; can anyone think of a reason it’s not?)
I don’t know a ton about how Unicode works, though, and I’m worried that if I just change both references to System.Text.Encoding.ASCII into System.Text.Encoding.Unicode, the hash algorithm might produce some byte combinations that don’t form valid Unicode chars and the GetString call will freak out.
Is this a valid concern, or will it be OK?
You shouldn’t be using any normal encoding to convert from arbitrary binary data back to a string. It’s not encoded text – it’s just a sequence of bytes. Don’t try to interpret it as if it were “normal” text. Whether the original password contains any non-ASCII characters is irrelevant to this – your current code is broken. (I would treat the linked article with a large dose of suspicion simply on that basis.)
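To make the problem concrete, here is a minimal sketch comparing a round trip through Encoding.ASCII with a round trip through Base64, one encoding that is designed for arbitrary bytes. The class and method names (BinaryToTextDemo, SurvivesAsciiRoundTrip, SurvivesBase64RoundTrip) are made up for illustration and do not come from the article.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class BinaryToTextDemo
{
    // Decode the bytes as ASCII text, encode the text again,
    // and report whether the original bytes came back unchanged.
    public static bool SurvivesAsciiRoundTrip(byte[] data)
    {
        string text = Encoding.ASCII.GetString(data);
        byte[] back = Encoding.ASCII.GetBytes(text);
        return data.SequenceEqual(back);
    }

    // Same check, but using Base64, which can represent any byte value.
    public static bool SurvivesBase64RoundTrip(byte[] data)
    {
        string text = Convert.ToBase64String(data);
        byte[] back = Convert.FromBase64String(text);
        return data.SequenceEqual(back);
    }
}
```

For an input such as new byte[] { 0x41, 0xC8, 0xFF }, the ASCII check should return false, because bytes above 0x7F are not valid ASCII and are replaced when decoding, while the Base64 check returns true. A hash routinely contains such bytes, so two different hashes can collapse to the same ASCII string.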
I would suggest:
- Encoding.UTF8 to get the bytes from the password. That will allow the password to contain any Unicode character. Encoding.Unicode would be fine here too.
- Convert.ToBase64String to convert from the computed hash back to text. Base64 is specifically designed to represent opaque binary data in text within the ASCII character set.
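Applied to the method from the question, those two changes might look like the sketch below. It keeps the question's SHA1 and its password-then-salt byte order; the wrapping class name PasswordHashing is an assumption added only to make the snippet self-contained.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical container class; only the method body mirrors the question.
public static class PasswordHashing
{
    public static string ComputeSaltedHash(string password, byte[] salt)
    {
        // UTF-8 can encode any Unicode character in the password.
        byte[] passwordBytes = Encoding.UTF8.GetBytes(password);

        // Append the salt to the password bytes, as in the original code.
        byte[] toHash = new byte[passwordBytes.Length + salt.Length];
        Array.Copy(passwordBytes, 0, toHash, 0, passwordBytes.Length);
        Array.Copy(salt, 0, toHash, passwordBytes.Length, salt.Length);

        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] computedHash = sha1.ComputeHash(toHash);

            // Base64 turns the raw hash bytes into text without losing any of them.
            return Convert.ToBase64String(computedHash);
        }
    }
}
```

Because the result is Base64, it can be stored in an ordinary text column and compared as a string; Convert.FromBase64String recovers the original hash bytes if they are ever needed.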