I need to count a lot of different items. I’m processing a list of pairs such as:
A34223,34
B23423,-23
23423212,16
What I was planning to do was hash the first value (the key) into a 32-bit integer, which will then be a key into a sparse structure where the ‘value’ will be added (all start at zero). The number can be negative.
Given that the keys are short and alphanumeric, is there a way to generate a hash algorithm that is fast on 32-bit x86 architectures? Or is there an existing suitable hash?
I don’t know anything about the design of hashes, but was hoping that due to the simple input, there would be a way of generating a high performance hash that guarantees no collision for a given key length of “X” and has high dispersion so minimizes collisions when length exceeds “X”.
As you are using C++, the first thing you should do is to create a trivial implementation using std::map. Is it fast enough (it probably will be)? If so, stick with it; otherwise, investigate whether your C++ implementation provides a hash table. If it does, use it to create a trivial implementation, test it, and time it. Is it fast enough (almost certainly yes)?
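A minimal sketch of such a trivial implementation is below. It assumes a C++11 or later compiler, where the standard hash table is std::unordered_map, and it assumes each input line has the form "key,value" as in the question. The name count_pairs is made up for this example.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <unordered_map>

// Sum the values per key from lines of the form "key,value".
// Table can be std::map<std::string, long> or
// std::unordered_map<std::string, long>; count_pairs is a hypothetical name.
template <typename Table>
Table count_pairs(std::istream& in) {
    Table totals;
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type comma = line.find(',');
        if (comma == std::string::npos) continue;  // skip lines without a comma
        // operator[] value-initialises a missing entry to 0,
        // so every key starts at zero as the question requires.
        totals[line.substr(0, comma)] += std::stol(line.substr(comma + 1));
    }
    return totals;
}
```

Calling count_pairs<std::map<std::string, long>>(std::cin) gives the std::map version; swapping in std::unordered_map<std::string, long> gives the hash table version with no other changes, so timing both on real data is cheap.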
Only after you have exhausted these options should you think of implementing your own hash table and hashing function.
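If timing does show that a custom hashing function is needed, the "no collision up to length X" idea from the question can be sketched as follows. This is only an illustration under stated assumptions: keys contain only ASCII letters and digits, letters are treated case-insensitively, and the names symbol_value and pack_key are made up here. Reading the key as a base-37 number (with 0 reserved so that keys of different lengths stay distinct) gives X = 6, because 37^6 = 2,565,726,409 is less than 2^32.

```cpp
#include <cstdint>
#include <string>

// Map one alphanumeric character to 1..36. The value 0 is reserved so that
// "A" and "0A" cannot pack to the same number. Treating letters
// case-insensitively is an assumption; any other character maps to 0.
inline std::uint32_t symbol_value(char c) {
    if (c >= '0' && c <= '9') return static_cast<std::uint32_t>(c - '0') + 1;
    if (c >= 'A' && c <= 'Z') return static_cast<std::uint32_t>(c - 'A') + 11;
    if (c >= 'a' && c <= 'z') return static_cast<std::uint32_t>(c - 'a') + 11;
    return 0;
}

// Read the key as a base-37 number. Alphanumeric keys of up to 6 characters
// get distinct values; longer keys wrap modulo 2^32 and can collide.
inline std::uint32_t pack_key(const std::string& key) {
    std::uint32_t h = 0;
    for (char c : key) h = h * 37u + symbol_value(c);
    return h;
}
```

With case-sensitive keys there are 62 symbols, so the same trick in base 63 only covers 5 characters. The sketch makes no claim about how well keys longer than 6 characters are dispersed, so that would need measuring on real data.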