I do not know much about hash codes. I found this code, which prints the collisions.
Can you please tell me what collisions are and how to reduce them?
Why should we use hash codes?
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// The enclosing class was not part of the snippet as found; its name is arbitrary.
public class HashCollisions
{
    // Maps a string to a slot index in [0, limit). The remainder may be
    // negative, so Math.abs folds it back into the valid range.
    public static int getHash(String str, int limit)
    {
        return Math.abs(str.hashCode() % limit);
    }

    public static void main(String[] args)
    {
        int hashLimit = 10000;
        int stringsLimit = 10000;
        String[] arr = new String[hashLimit];
        List<String> test = new ArrayList<String>();
        Random r = new Random(2); // fixed seed, so every run generates the same strings

        // Generate random 10-character strings with char codes 35..94.
        for (int i = 0; i < stringsLimit; i++)
        {
            StringBuilder buf = new StringBuilder();
            for (int j = 0; j < 10; j++)
            {
                char c = (char) (35 + 60 * r.nextDouble());
                buf.append(c);
            }
            test.add(buf.toString());
        }

        // Put each string into its slot; a collision is a slot that is
        // already occupied by a different string.
        int collisions = 0;
        for (String curStr : test)
        {
            int hashCode = getHash(curStr, hashLimit);
            if (arr[hashCode] != null && !arr[hashCode].equals(curStr))
            {
                System.out.println("collision of [" + arr[hashCode] + "] (" + arr[hashCode].hashCode() + " = " + hashCode
                        + ") with [" + curStr + "] (" + curStr.hashCode() + " = " + hashCode + ")");
                collisions++;
            }
            else
            {
                arr[hashCode] = curStr;
            }
        }
        System.out.println("Collisions: " + collisions);
    }
}
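Collisions are when two non-equal objects have the same hash code. They're a fact of life: a Java hash code is a 32-bit int, and there are far more possible strings than int values, so some non-equal strings must share a hash code. You need to deal with them.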
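To make that concrete, here is a small sketch using a well-known pair of strings, "Aa" and "BB", which have the same String.hashCode() even though they are not equal. The class name CollisionDemo and the helper isCollision are just names made up for this illustration.

```java
public class CollisionDemo {
    // Hypothetical helper: true when two non-equal objects share a hash code.
    public static boolean isCollision(Object x, Object y) {
        return x.hashCode() == y.hashCode() && !x.equals(y);
    }

    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        System.out.println(a.hashCode());      // 2112  ('A' * 31 + 'a')
        System.out.println(b.hashCode());      // 2112  ('B' * 31 + 'B')
        System.out.println(a.equals(b));       // false
        System.out.println(isCollision(a, b)); // true
    }
}
```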
We use hash codes because they make it quick to look up values by key. A hash table uses the hash code to narrow the possible key matches down to a very small set (often just one) very quickly, at which point it only needs to check those few candidates for actual key equality.
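As a rough sketch of that two-step lookup (hash code first, then equality), here is a toy set of strings. TinyHashSet is a hypothetical class written only for this answer, not how java.util.HashSet is actually implemented; it assumes a fixed number of buckets and never resizes.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyHashSet {
    // Each bucket holds the keys whose hash code maps to that index.
    private final List<List<String>> buckets = new ArrayList<>();

    public TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Step 1: the hash code selects one bucket. floorMod keeps the
    // index non-negative even when hashCode() is negative.
    private List<String> bucketFor(String key) {
        return buckets.get(Math.floorMod(key.hashCode(), buckets.size()));
    }

    // Step 2: inside the bucket, equals() decides (List.contains uses equals).
    public boolean contains(String key) {
        return bucketFor(key).contains(key);
    }

    // Returns false if the key was already present.
    public boolean add(String key) {
        List<String> bucket = bucketFor(key);
        if (bucket.contains(key)) {
            return false;
        }
        bucket.add(key);
        return true;
    }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(16);
        set.add("Aa");
        set.add("BB"); // same hash code as "Aa", so it lands in the same bucket
        System.out.println(set.contains("Aa")); // true
        System.out.println(set.contains("BB")); // true
        System.out.println(set.contains("Ab")); // false
    }
}
```

Colliding keys still work correctly here; they just share a bucket, so the equality check has a little more to scan.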
You should never assume that two hash codes being equal means the objects they were derived from are equal. Only the other direction holds: assuming a correct implementation, equal objects always give equal hash codes, so if two objects give different hash codes, then they are not equal.
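Here is a minimal sketch of what a "correct implementation" looks like for your own class: equals and hashCode are computed from the same fields, so equal objects always produce equal hash codes. The Point class and HashContractDemo are hypothetical names used only for illustration.

```java
import java.util.Objects;

public class HashContractDemo {
    public static final class Point {
        final int x;
        final int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Built from exactly the fields that equals() compares.
        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        Point c = new Point(2, 1);

        // Equal objects must have equal hash codes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true

        // Different hash codes are enough to conclude "not equal".
        System.out.println(a.hashCode() != c.hashCode()); // true
        System.out.println(a.equals(c));                  // false
    }
}
```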