I do not know in advance how many elements are going to be stored in my Hashmap . So how big should the capacity of my HashMap be ? What factors should I take into consideration here ? I want to minimize the rehashing process as much as possible since it is really expensive.
I do not know in advance how many elements are going to be stored
Share
You want to have a good tradeoff between space requirement and speed (which is reduced if many collisions happen, which becomes more likely if you reduce the space allocation).
You can define a load factor, the default is probably fine.
But what you also want to avoid is having to rebuild and extend the hash table as it grows. So you want to size it with the maximum capacity up front. Unfortunately, for that, you need to know roughly how much you are going to put into it.
If you can afford to waste a little memory, and at least have a reasonable upper bound for how large it can get, you can use that as the initial capacity. It will never rehash if you stay below that capacity. The memory requirement is linear to the capacity (maybe someone has numbers). Keep in mind that with a default load factor of 0.75, you need to set your capacity a bit higher than the number of elements, as it will extend the table when it is already 75% full.
If you really have no idea, just use the defaults. Not because they are perfect in your case, but because you don’t have any basis for alternative settings.
The good news is that even if you set suboptimal values, it will still work fine, just waste a bit of memory and/or CPU cycles.