I’m looking for a data structure that can possibly outperform Dictionary<string, object>. I have a map that has N items – the map is constructed once and then read many, many times. The map doesn’t change during the lifetime of the program (no new items are added, no items are deleted and items are not reordered). Because the map doesn’t change, it doesn’t need to be thread-safe, even though the application using it is heavily multi-threaded. I expect that ~50% of lookups will happen for items not in the map.
Dictionary<TKey, TItem> is quite fast and I may end up using it but I wonder if there’s another data structure that’s faster for this scenario. While the rest of the program is obviously more expensive than this map, it is used in performance-critical parts and I’d like to speed it up as much as possible.
What you’re looking for is a Perfect Hash Function. You can create one based on your list of strings, and then use it for the Dictionary.
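To make the idea concrete, here is a minimal sketch of one way to build a perfect hash for a fixed set of strings: a two-level "hash and displace" scheme. The class name PerfectStringMap, the seeded FNV-1a hash with a MurmurHash3-style finalizer, the table size of 2N slots and the seed limit are all assumptions chosen for illustration, not something taken from the framework or from a specific library. It is not tuned, and whether it beats Dictionary would have to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a framework type): a read-only string map built once
// from a fixed key set with a two-level "hash and displace" perfect hash.
public sealed class PerfectStringMap<TValue>
{
    private readonly uint[] _seeds;    // per-bucket seed found at build time (0 = empty bucket)
    private readonly string[] _keys;   // one key per slot, null when the slot is unused
    private readonly TValue[] _values;

    public PerfectStringMap(IEnumerable<KeyValuePair<string, TValue>> items)
    {
        var list = items.ToList();
        int bucketCount = Math.Max(1, list.Count);
        int slotCount = Math.Max(1, list.Count * 2); // spare slots keep the seed search short
        _seeds = new uint[bucketCount];
        _keys = new string[slotCount];
        _values = new TValue[slotCount];

        // Level 1: group keys into buckets using the fixed seed 0, largest bucket first.
        var buckets = list.GroupBy(kv => Hash(kv.Key, 0) % (uint)bucketCount)
                          .OrderByDescending(g => g.Count());

        // Level 2: for each bucket, search for a seed that sends all of its keys
        // to distinct slots that are still free.
        foreach (var bucket in buckets)
        {
            var members = bucket.ToList();
            for (uint seed = 1; ; seed++)
            {
                if (seed == 1_000_000)
                    throw new InvalidOperationException("No seed found (duplicate keys?).");

                var slots = members.Select(kv => (int)(Hash(kv.Key, seed) % (uint)slotCount)).ToList();
                bool ok = slots.Distinct().Count() == slots.Count
                          && slots.All(s => _keys[s] == null);
                if (!ok) continue;

                for (int i = 0; i < members.Count; i++)
                {
                    _keys[slots[i]] = members[i].Key;
                    _values[slots[i]] = members[i].Value;
                }
                _seeds[bucket.Key] = seed;
                break;
            }
        }
    }

    public bool TryGetValue(string key, out TValue value)
    {
        uint seed = _seeds[Hash(key, 0) % (uint)_seeds.Length];
        int slot = (int)(Hash(key, seed) % (uint)_keys.Length);
        // Any string hashes to some slot, so the stored key must still be
        // compared (ordinally) to reject keys that are not in the map.
        if (seed != 0 && string.Equals(_keys[slot], key))
        {
            value = _values[slot];
            return true;
        }
        value = default;
        return false;
    }

    // Seeded FNV-1a followed by the 32-bit MurmurHash3 finalizer for better mixing.
    private static uint Hash(string s, uint seed)
    {
        unchecked
        {
            uint h = 2166136261u ^ seed;
            foreach (char c in s) h = (h ^ c) * 16777619u;
            h ^= h >> 16; h *= 0x85ebca6bu;
            h ^= h >> 13; h *= 0xc2b2ae35u;
            h ^= h >> 16;
            return h;
        }
    }
}
```

A lookup costs two hash computations and exactly one string comparison, with no probing or chaining, which is the property that matters when about half of the lookups are misses. After construction the map is only read, so it can be shared between threads without locking.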
The non-generic Hashtable has a constructor that accepts IHashCodeProvider that lets you specify your own hash function. I couldn’t find an equivalent for Dictionary, so you might have to resort to using a Hashtable instead.
You can use it internally in your PerfectStringHash class, which will do all the type casting for you.
Note that you may need to be able to specify the number of buckets in the hash. I think Hashtable only lets you specify the load factor. You may find out you need to roll your own hash entirely. It’s a good class for everyone to use, I guess, a generic perfect hash.
EDIT: Apparently someone already implemented some Perfect Hash algorithms in C#.
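For what it's worth, Dictionary<TKey, TValue> does have a constructor that takes an IEqualityComparer<TKey>, and the comparer's GetHashCode is what the dictionary uses to hash keys, so a custom hash can be plugged in without falling back to Hashtable. The sketch below assumes a hypothetical comparer named CustomHashComparer that uses plain FNV-1a purely as a stand-in for whatever hash function you generate. Note that Dictionary still picks its own bucket count and chains colliding entries, so a perfect hash supplied this way is not guaranteed to stay collision-free inside the dictionary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: supplies a custom hash function to Dictionary<string, TValue>.
public sealed class CustomHashComparer : IEqualityComparer<string>
{
    // Ordinal comparison keeps equality consistent with the character-based hash below.
    public bool Equals(string x, string y) => string.Equals(x, y, StringComparison.Ordinal);

    // FNV-1a over the UTF-16 code units; a placeholder for a generated hash function.
    public int GetHashCode(string s)
    {
        unchecked
        {
            uint h = 2166136261u;
            foreach (char c in s) h = (h ^ c) * 16777619u;
            return (int)h;
        }
    }
}

public static class ReadOnlyMapFactory
{
    // Builds the map once; afterwards it is only read.
    public static Dictionary<string, object> Build(IEnumerable<KeyValuePair<string, object>> items)
    {
        var map = new Dictionary<string, object>(new CustomHashComparer());
        foreach (var kv in items) map.Add(kv.Key, kv.Value);
        return map;
    }
}
```

Lookups then go through TryGetValue as usual, for example `map.TryGetValue("someKey", out var item)`, which handles hits and misses in a single call. Whether a custom hash actually beats the default string hash for your keys is something to benchmark rather than assume.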