I am trying to understand what problems we can run into if we implement the hashCode() method incorrectly.
For example, I created a sample class HashEx whose hashCode() returns the same value (100) for every instance, and then used HashEx in a HashSet and a HashMap with different operations:
HashSet -> add, read, contains
HashMap -> put, get
So far, all the operations seem to work fine. Any thoughts on this wild idea? Where will this wrong implementation of hashCode() cause problems?
import java.util.HashMap;
import java.util.HashSet;
import java.util.Objects;

public class HashEx {
    public int id;
    public String name;

    public static void main(String[] args) {
        HashEx e1 = new HashEx();
        e1.id = 1;
        e1.name = "Tom";
        HashEx e2 = new HashEx();
        e2.id = 2;
        e2.name = "Jerry";

        // set
        HashSet<HashEx> myset = new HashSet<HashEx>();
        myset.add(e1);
        myset.add(e2);
        System.out.println("Set size : " + myset.size());
        for (HashEx e : myset) {
            System.out.println("id: " + e.id + ", name: " + e.name);
        }
        // e4 is a different instance that is equal to e2
        HashEx e4 = new HashEx();
        e4.id = 2;
        e4.name = "Jerry";
        System.out.println("myset.contains(e4) : " + myset.contains(e4));

        // map
        HashMap<HashEx, String> map = new HashMap<HashEx, String>();
        map.put(e1, "Tom");
        map.put(e2, "Jerry");
        System.out.println("Map size : " + map.size());
        System.out.println(map.get(e1));
        System.out.println(map.get(e2));
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        // instanceof is false for null and for unrelated types,
        // so the cast below cannot throw
        if (!(obj instanceof HashEx))
            return false;
        HashEx other = (HashEx) obj;
        // Objects.equals tolerates a null name
        return id == other.id && Objects.equals(name, other.name);
    }

    // Deliberately bad: every instance gets the same hash value
    @Override
    public int hashCode() {
        return 100;
    }
}
Everything will work properly (so long as you correctly implement equals(Object) in your HashEx class) in the sense that you won’t see any incorrect behavior. But once you have a large number of those objects in your HashSet (or as keys in a HashMap), you will start seeing very poor performance. Objects are put into buckets based on their hashCode, and with a constant hashCode they all land in the same bucket, which has to be searched linearly whenever one of the collection operations is done.
So a better test to demonstrate the problem would be to write a loop that just keeps adding more and more objects (until the program runs out of memory or you kill it) and prints a status message every 10,000 objects. You’ll see that each add gets slower and slower as the set grows, so the total running time grows quadratically.
If the objects instead have different hashCode values, the operations will not slow down (much) at all, and the program will reach the out-of-memory point much sooner.
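A bounded version of that loop could look like the sketch below. The names HashCodeTiming, Key, fill and the constantHash flag are made up for illustration and are not from the question. The sizes are arbitrary, and the actual timings will depend on your JVM and machine.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashCodeTiming {

    /** Hypothetical key class; the flag switches between a constant and a field-based hash. */
    public static final class Key {
        final int id;
        final String name;
        final boolean constantHash;

        public Key(int id, String name, boolean constantHash) {
            this.id = id;
            this.name = name;
            this.constantHash = constantHash;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Key)) return false;
            Key other = (Key) obj;
            return id == other.id && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // 100 mimics HashEx; Objects.hash spreads keys across buckets
            return constantHash ? 100 : Objects.hash(id, name);
        }
    }

    /** Adds count distinct keys to a new HashSet and returns the set. */
    public static Set<Key> fill(int count, boolean constantHash) {
        Set<Key> set = new HashSet<>();
        for (int i = 0; i < count; i++) {
            set.add(new Key(i, "name" + i, constantHash));
        }
        return set;
    }

    /** Returns the time in milliseconds taken to add count keys. */
    static long timeMillis(int count, boolean constantHash) {
        long start = System.nanoTime();
        fill(count, constantHash);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Doubling the size makes the growth rate of each variant easy to compare
        for (int count = 2_000; count <= 16_000; count *= 2) {
            System.out.printf("%d objects: constant hash %d ms, field-based hash %d ms%n",
                    count, timeMillis(count, true), timeMillis(count, false));
        }
    }
}
```

Both variants end up with the same set contents, so correctness is unaffected; only the time per operation differs. If the explanation above holds, the constant-hash column should grow much faster than the field-based one each time the size doubles.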