I have read about hashtable and open adrdessing. If you want to insert the keys: 18,32,44 in an hashtable with size 13:
18 gets index 5 (18 modulus 13 = 5)
32 gets index 6 (32 modulus 13 = 6)
44 gets index 5 (44 modulus 13 = 5)
You’ll get a collision because there are already something on index 5.
If you use linear probing you’ll do hashfunction = (key+i) modulus N where i = 0,1,2.. until you find an empty place in the hashtable. Then 44 will get be inserted at index 7.
What if you delete 32, and then you want to delete 44. You start by looking at hashfunction(44)=5 – that was not 44, then hashfunction(44 + 1) = 6 – that is empty. Then you might think that 44 is gone. How do you mark a place in the hashtable, that the place is not really empty, but does not contain a key, and that you should keep looking for 44 at the next index?
If you then need to insert another key at index 6 then the key just overwrites the “mark” in the hashtable.
What could you use to mark an index – saying here has been an key, but has been deleted – so you continue to look at next index? You can’t just write null or 0 because then either you think the key has been deleted (null) or that an key with value 0 has overwritten 44.
One way to handle hash tables using open addressing is to use state marks:
EMPTY,OCCUPIEDandDELETED. Note that there’s an important distinction betweenEMPTY, which means the position has never been used andDELETED, which means it was used but got deleted.When a value gets removed, the slot is marked as
DELETED, notEMPTY. When you try to retrieve a value, you’ll probe until you find a slot that’s markEMPTY; eg: you considerDELETEDslots to be the same asOCCUPIED. Note that insertion can ignore this distinction – you can insert into aDELETEDorEMPTYslot.The question is tagged Java, which is a bit misleading because Java (or at least Oracle’s implementation of it) does not use open addressing. Open addressing gets specially problematic when the load factor gets high, which causes hash collisions to occur much more often:
As you can see, there’s a dramatic performance drop near the 0.7 mark. Most hashtables get resized once their load factor gets past a certain constant factor. Java for example doubles the size of its
HashMapwhen the load factor gets past 0.75.