The JDK documentation for java.lang.String.hashCode() famously says:
The hash code for a String object is computed as
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
using int arithmetic, where s[i] is the *i*th character of the string, n is the length of the string, and ^ indicates exponentiation.
The standard implementation of this expression is:
int hash = 0;
for (int i = 0; i < length; i++)
{
hash = 31*hash + value[i];
}
return hash;
Looking at this makes me feel like I was sleeping through my algorithms course. How does that mathematical expression translate into the code above?
I’m not sure if you missed where it says “^ indicates exponentiation” (not xor) in that documentation.
Each time through the loop, the previous value of hash is multiplied by 31 again before the next element of value is added to it. One could prove these things are equal by induction, but I think an example might be clearer:
Say we’re dealing with a 4-char string. Let’s unroll the loop:
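Assuming the four characters sit in value[0] through value[3], as in the loop above, the unrolled body would look roughly like this (the class and method names are made up for illustration):

```java
class UnrolledHash {
    // Hypothetical helper: the loop body written out once per character of a 4-char array.
    static int hash(char[] value) {
        int hash = 0;
        hash = 31 * hash + value[0];
        hash = 31 * hash + value[1];
        hash = 31 * hash + value[2];
        hash = 31 * hash + value[3];
        return hash;
    }
}
```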
Now combine these into one statement by substituting each value of hash into the following statement:
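A sketch of the resulting single statement, again assuming a 4-char array called value and a made-up wrapper class:

```java
class SubstitutedHash {
    // Hypothetical helper: each intermediate hash is substituted into the next statement.
    static int hash(char[] value) {
        return 31 * (31 * (31 * (31 * 0 + value[0]) + value[1]) + value[2]) + value[3];
    }
}
```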
31 * 0 is 0, so simplify:
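Dropping the 31 * 0 term, the expression should reduce to something like this (same assumptions, hypothetical class name):

```java
class SimplifiedHash {
    // Hypothetical helper: the innermost 31 * 0 has been removed.
    static int hash(char[] value) {
        return 31 * (31 * (31 * value[0] + value[1]) + value[2]) + value[3];
    }
}
```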
Now multiply the two inner terms by that second 31:
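Distributing the second 31 over 31 * value[0] + value[1] would give roughly the following (hypothetical class name, still assuming a 4-char value array):

```java
class PartlyExpandedHash {
    // Hypothetical helper: the second 31 is multiplied through the two innermost terms.
    static int hash(char[] value) {
        return 31 * (31 * 31 * value[0] + 31 * value[1] + value[2]) + value[3];
    }
}
```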
Now multiply the three inner terms by that first 31:
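Distributing the remaining outer 31 over the three terms inside the parentheses would leave a flat sum, sketched here under the same assumptions (hypothetical class name):

```java
class ExpandedHash {
    // Hypothetical helper: every parenthesis is multiplied out, leaving a plain sum.
    static int hash(char[] value) {
        return 31 * 31 * 31 * value[0] + 31 * 31 * value[1] + 31 * value[2] + value[3];
    }
}
```

Because int arithmetic wraps around consistently, regrouping the multiplications this way does not change the result, even when it overflows.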
and convert to exponents (not really Java anymore):
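Written as math rather than Java, and still assuming the four characters are value[0] through value[3], the sum would read:

```latex
\[
31^{3}\,\mathrm{value}[0] + 31^{2}\,\mathrm{value}[1] + 31^{1}\,\mathrm{value}[2] + \mathrm{value}[3]
\]
```

With n = 4 and s[i] = value[i], that is the documented s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1].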