I’m implementing a program that uses variable-length ids. These ids identify a message and are sent to a broker, which performs some operation (not relevant to the question). However, the broker limits the id to 24 bytes. I was thinking about hashing the id with SHA before sending it to the broker and dropping bytes until only 24 remain.
However, I want to get an idea of how much this will increase the chance of collisions. This is what I have so far:
I found that for a “perfect” hash the probability of a collision is approximately p^2 / 2^(n+1), where p is the number of messages and n is the size of the hash output in bits. Here is where my problem starts: I’m assuming that after removing some bytes from the final hash, the function still remains “perfect” and I can still use the same formula. Under that assumption I get:
5160^2 / 2^(192+1) ≈ 2.12 × 10^-51
Here 5160 is the peak number of messages and 192 is the number of bits in 24 bytes.
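The arithmetic above can be checked with a few lines of Python. This is only a sketch of the approximation p^2 / 2^(n+1) quoted above; the function name `birthday_bound` is made up for illustration, and the result is an estimate that only makes sense while it is much smaller than 1.

```python
from fractions import Fraction


def birthday_bound(num_messages: int, hash_bits: int) -> Fraction:
    """Approximate collision probability p^2 / 2^(n+1) for an ideal hash.

    num_messages is p, hash_bits is n (the output size after truncation).
    """
    return Fraction(num_messages ** 2, 2 ** (hash_bits + 1))


# Values from the question: 5160 messages, 24 bytes = 192 bits.
p_collision = birthday_bound(5160, 192)
print(f"{float(p_collision):.2e}")
```

Using exact integer arithmetic through `Fraction` avoids any rounding problems with the very large denominator; the value is converted to a float only for printing.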
My questions:
- Is my assumption correct? Does the hash stay “perfect” after removing some bytes?
- If so, and since the probability is really small, which bytes should I remove: the most or the least significant? Does it matter at all?
PS: Any other suggestion to achieve the same result is welcome. Thanks.
SHA-1 outputs only 20 bytes (160 bits), so you’d need to pad it to reach 24 bytes, at least if all byte values are valid and you’re not restricted to hex or Base64. I recommend using truncated SHA-2 instead.
Pretty much. Truncating a hash should preserve all of its important properties, though at the reduced security level corresponding to the smaller output size.
That should not matter at all. NIST defined a truncated SHA-2 variant, called SHA-224, which takes the first 28 bytes of SHA-256 using a different initial state for the hash calculation.
My recommendation is to use SHA-256 and keep the first 24 bytes. Finding one collision then requires around 2^96 hash-function calls, which is currently infeasible even for extremely powerful attackers and makes accidental collisions essentially impossible.
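As a rough illustration of that recommendation, here is a minimal Python sketch using the standard `hashlib` module. The names `broker_id` and `MAX_ID_BYTES` and the sample id are hypothetical, and the sketch assumes the broker accepts arbitrary byte values in the id; if it only accepts hex or Base64 text, fewer hash bits fit into 24 bytes and the collision estimate changes accordingly.

```python
import hashlib

MAX_ID_BYTES = 24  # broker limit from the question (assumed to allow any byte value)


def broker_id(original_id: bytes) -> bytes:
    """Hash a variable-length id with SHA-256 and keep the first 24 bytes."""
    return hashlib.sha256(original_id).digest()[:MAX_ID_BYTES]


# Hypothetical id, used only to show the call.
short_id = broker_id(b"some-very-long-message-identifier")
print(len(short_id), short_id.hex())
```

The same input always maps to the same 24-byte id, so the sender can recompute it whenever needed without storing a lookup table.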