I have a colleague struggling with a hashing problem.
There is a 17-alphanumeric valued key (a VIN code) that needs to be converted into a 4 byte value (could be alphanumeric as well). Knowing that those 4 bytes will limit the number of keys, what perfect-hash algorithm would you see for this problem?
After a quick look at Wikipedia, I think you could first “compress” the key, or in other word, you do hash in 2 stages.
Stage 1: break down the key to individual parts according to the standard, and do customized hash independently.
Stage 2: Get hashes together, and do a normal hash.
An naive example:
If your data is limited to United States, there is only 27 possibility of the first 2 bytes, so the first 2 bytes can be hashed to 0 – 26. (Suppose we get
ahere.)Then suppose other bytes have N possibilities, and can be hashed to 0 – N-1. (Suppose we get
bhere.)The combinational result can be
a * N + b. Then do a normal hash (if 26 * N > what 4 bytes can express).