If you run the following on HotSpot Java 7 64-bit version.
int countTopBit = 0, countLowestBit = 0;
for (int i = 0; i < 100000000; i++) {
int h = new Object().hashCode();
if (h < 0)
countTopBit++;
if ((h & 1) == 1)
countLowestBit++;
}
System.out.println("The count of negative hashCodes was " + countTopBit + ", the count of odd hashCodes was " + countLowestBit);
you can get a result like
The count of negative hashCodes was 0, the count of odd hashCodes was 49994232
I was wondering if this means the Object.hashCode() is only really 31-bit and why this might be so?
It is not the case that the top bit is not used. From the source for HashMap
257 /**
258 * Applies a supplemental hash function to a given hashCode, which
259 * defends against poor quality hash functions. This is critical
260 * because HashMap uses power-of-two length hash tables, that
261 * otherwise encounter collisions for hashCodes that do not differ
262 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
263 */
264 static int hash(int h) {
265 // This function ensures that hashCodes that differ only by
266 // constant multiples at each bit position have a bounded
267 // number of collisions (approximately 8 at default load factor).
268 h ^= (h >>> 20) ^ (h >>> 12);
269 return h ^ (h >>> 7) ^ (h >>> 4);
270 }
HotSpot supports a variety of hashing algorithms for
Object. As you’ve discovered emprically, the top bit is always masked out before the result is returned:markOopDesc::hash_maskis computed as follows:As you can see,
markOopDesc::hash_maskalways has bit 31 set to zero.As to why this is done, your guess is as good as mine. It could have been that the original developer felt that only dealing with positive integers would simplify things down the line. For all we know, it could even be an off-by-one error in the
hash_bitscomputation. 😉