I’ve written a custom hash function for my custom key type in stdext::hash_map and would like to check whether it is any good. I’m using the STL supplied with VS 2008. A typical check, as far as I know, is to look at how uniformly the elements are distributed among the buckets.
How should I set up such a check correctly? One solution that comes to mind is to modify the STL sources and add a method to hash_map that walks through the buckets and measures the distribution. Are there any better ways?
Maybe derive from hash_map and add such a method there?
I’d run one (large) dataset through hash_map. Once done, I’d collect the results for all buckets, using a method from hash_map that reports the number of elements in a given bucket.
Finally, I would compute the standard deviation (SD) of the element-to-bucket distribution.
I’d do the above for different hash functions. Whichever hash function results in minimum SD is the winner (for this dataset).
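As a rough illustration of the procedure above, here is a minimal sketch. It makes several assumptions: it uses std::unordered_map (C++11) as a stand-in, because its bucket_count() and bucket_size(n) members are standard, and I have not checked which bucket-inspection members the VS 2008 stdext::hash_map exposes. The names LengthHash, bucket_size_stddev and stddev_for_hasher are made up for this example, and the "dataset" is just synthetic string keys.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Deliberately poor hasher, used only for illustration (hypothetical):
// it looks at the key length alone, so all equal-length keys collide.
struct LengthHash {
    std::size_t operator()(const std::string& key) const {
        return key.size();
    }
};

// Population standard deviation of the per-bucket element counts of any
// container that exposes size(), bucket_count() and bucket_size(n).
template <typename Table>
double bucket_size_stddev(const Table& table) {
    const std::size_t buckets = table.bucket_count();
    if (buckets == 0) return 0.0;

    // The mean load per bucket is simply size / bucket_count.
    const double mean = static_cast<double>(table.size()) / buckets;

    double sum_sq = 0.0;
    for (std::size_t b = 0; b < buckets; ++b) {
        const double diff = static_cast<double>(table.bucket_size(b)) - mean;
        sum_sq += diff * diff;
    }
    return std::sqrt(sum_sq / buckets);
}

// Loads `count` synthetic keys ("key0", "key1", ...) into a table that
// uses Hasher and returns the resulting bucket-size SD.
template <typename Hasher>
double stddev_for_hasher(std::size_t count) {
    std::unordered_map<std::string, int, Hasher> table;
    for (std::size_t i = 0; i < count; ++i) {
        table.emplace("key" + std::to_string(i), 0);
    }
    assert(table.size() == count);  // all synthetic keys are distinct
    return bucket_size_stddev(table);
}
```

Calling stddev_for_hasher<LengthHash>(2000) and stddev_for_hasher<std::hash<std::string> >(2000) and comparing the two numbers gives the "lowest SD wins" comparison for that key set. The comparison is only meaningful when both tables end up with the same number of buckets, so it is worth printing bucket_count() as well when trying this on a real dataset.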