From perldata:
You can preallocate space for a hash by assigning to the keys() function.
This rounds up the allocated buckets to the next power of two:
keys(%users) = 1000; # allocate 1024 buckets
Is there a rule of thumb for when presizing a hash will improve performance?
The rule of thumb is that the larger you know the hash will be, the more likely you are to get value out of presizing it. If your hash holds only 10 entries and you add them one after the other, the expansions will be a) few (if any) and b) cheap (since there is little data to move).
But if you KNOW you’re going to need at least 1M items, then there’s no reason to let the table expand, and redistribute its ever-growing contents, over and over while it grows.
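As a rough sketch of what that looks like when the final size is known up front: the key list below is made-up data, and the bucket count is read with num_buckets from Hash::Util, which I'm assuming is available (Perl 5.26 or later). The exact number of buckets you get back may differ between Perl versions, so treat the printed values as informational only.

```perl
use strict;
use warnings;
use Hash::Util qw(num_buckets);    # assumption: Perl 5.26 or later

# Hypothetical data: we know in advance how many keys we will insert.
my @names = map { "user$_" } 1 .. 1000;

my %users;
keys(%users) = scalar @names;      # presize before inserting anything
my $buckets_before = num_buckets(%users);

$users{$_} = 1 for @names;         # fill the hash
my $buckets_after = num_buckets(%users);

print "buckets before fill: $buckets_before, after fill: $buckets_after\n";
```

If presizing did its job, the bucket count should not need to grow while the hash is being filled.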
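Will you NOTICE this expansion? Eh, maybe. Modern machines are pretty darn fast, so it may never come up. But every expansion is a grand opportunity for the heap to grow, with the allocator churn and cascade of side effects that comes with it. (Perl is reference-counted, so there is no GC pause as such; the cost is reallocating the bucket array and redistributing the entries.) So, if you know you’re going to use the space, presizing is a “cheap” fix to tweak out a few more milibleems of performance.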
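The only way to know whether it matters for your workload is to measure. Here is a minimal timing sketch, assuming 1M integer keys and a hypothetical helper named fill; real keys, values, and Perl builds will change the numbers, and the difference may well be lost in the noise.

```perl
use strict;
use warnings;
use Time::HiRes ();

my $n = 1_000_000;    # assumed workload size, matching the 1M example above

# Hypothetical helper: fill a hash with $count keys, optionally presized.
sub fill {
    my ($count, $presize) = @_;
    my %h;
    keys(%h) = $count if $presize;    # preallocate buckets up front
    $h{$_} = 1 for 1 .. $count;
    return scalar keys %h;            # number of keys stored
}

for my $presize (0, 1) {
    my $start   = Time::HiRes::time();
    my $size    = fill($n, $presize);
    my $elapsed = Time::HiRes::time() - $start;
    printf "%-8s %d keys in %.3f s\n",
        ($presize ? 'presized' : 'default'), $size, $elapsed;
}
```

Run it a few times before drawing conclusions; a single run of each variant is easily skewed by whatever else the machine is doing.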