For an assignment I have to write the code for a generic Hash Table. In an example Put method, there are two lines:
int hash = key.hashCode(); // get the hashcode of the key
int index = compress(hash); // compress it to an index
I was of the understanding that the hashCode method used the key to return an index, and you would place the key/value pair in the array at that index. But here we “compress” the hash code to get the index. What does this method do? How does it “compress” the hash code? Is it necessary and/or preferred?
The hash code can be any integer between -2^31 and 2^31-1. That’s ~4 billion different possible hash codes. If you have, say, 40 hash table buckets, you need to “compress” those 4 billion integers down to the range 0-39.
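A common way to do this is with the modulus operator %. a % b returns the remainder after dividing a by b. For example, 7 % 3 == 1.
Note: The sign of the result differs between languages. In Java the result of % takes the sign of the dividend, so 7 % 3 is non-negative, but a negative hash code gives a negative remainder (for example, -7 % 3 == -1). Because hashCode() can return negative values, compress has to account for that, for instance with Math.floorMod(hash, n) or by clearing the sign bit first with (hash & 0x7fffffff) % n. C and C++ behave like Java since C99 and C++11 (before those standards the sign was implementation-defined), so negative hash values need the same care there.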
See integer modulo operators in various programming languages on Wikipedia for a breakdown of how each programming language handles modulus’s sign.
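As a rough illustration of the compression step, here is a minimal sketch in Java. The class name CompressDemo and the bucketCount parameter are made up for this example and are not from your assignment, so your compress may take different arguments. It assumes Java 8 or later, where Math.floorMod is available, and a positive bucket count.

```java
// Hypothetical helper class; the name and the bucketCount parameter are assumptions.
final class CompressDemo {

    // Maps any int hash code to an index in the range [0, bucketCount).
    // Assumes bucketCount > 0.
    static int compress(int hash, int bucketCount) {
        // Math.floorMod gives a result with the sign of the divisor,
        // so with a positive bucketCount the index is never negative.
        return Math.floorMod(hash, bucketCount);
    }

    public static void main(String[] args) {
        int buckets = 40;
        // Plain % keeps the sign of the dividend, so this prints -7.
        System.out.println(-7 % buckets);
        // floorMod wraps the negative value into range, so this prints 33.
        System.out.println(compress(-7, buckets));
        // A real key: the index depends on the string's hash code.
        System.out.println(compress("key".hashCode(), buckets));
    }
}
```

With 40 buckets, every key ends up at an index from 0 to 39, no matter how large or negative its hash code is. Two different hash codes can compress to the same index, which is why a hash table still needs some collision handling.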