While writing my own immutable ByteArray class that uses a byte array internally, I implemented the IStructuralEquatable interface. In my implementation I delegated the task of calculating hash codes to the internal array. While testing it, to my great surprise, I found that my two different arrays had the same structural hash code, i.e. they returned the same value from GetHashCode. To reproduce:
IStructuralEquatable array11 = new int[] { 1, 1 };
IStructuralEquatable array12 = new int[] { 1, 2 };
IStructuralEquatable array22 = new int[] { 2, 2 };
var comparer = EqualityComparer<int>.Default;
Console.WriteLine(array11.GetHashCode(comparer)); // 32
Console.WriteLine(array12.GetHashCode(comparer)); // 32
Console.WriteLine(array22.GetHashCode(comparer)); // 64
IStructuralEquatable is quite new and unknown, but I read somewhere that it can be used to compare the contents of collections and arrays. Am I wrong, or is my .Net wrong?
Note that I am not talking about Object.GetHashCode!
Edit:
So, I am apparently wrong as unequal objects may have equal hash codes. But isn’t GetHashCode returning a somewhat randomly distributed set of values a requirement? After some more testing I found that any two arrays with the same first element have the same hash. I still think this is strange behavior.
What you have described is not a bug.
GetHashCode()does not guarantee unique hashes for nonequal objects.From MSDN:
EDIT
While the MSFT .NET implementation of
GetHashCode()forArray.IStructuralEquatableobeys the principles in the above MSDN documentation, it appears that the authors did not implement it as intended.Here is the code from “Array.cs”:
Notice in particular this line:
ret = CombineHashCodes(ret, comparer.GetHashCode(GetValue(0)));Unless I am mistaken, that
0was intended to bei. Because of this,GetHashCode()always returns the same value for arrays with the same max(0, n-8th) element, where n is the length of the array. This isn’t wrong (doesn’t violate documentation), but it is clearly not as good as it would be if0were replaced withi. Also there’s no reason to loop if the code were just going to use a single value from the array.