Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • Home
  • SEARCH
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 8450023
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: June 10, 20262026-06-10T10:51:37+00:00 2026-06-10T10:51:37+00:00

I have been reading about hashcode functions for the past couple of hours and

  • 0

I have been reading about hashcode functions for the past couple of hours and have accumulated a couple of questions regarding use of prime numbers as multipliers in custom hashcode implementations. I would be grateful if I could get some insight regarding following questions:

  • In a comment to @mattb’s answer here, @hstoerr advocates for use of larger primes (such as 524287) instead of the common prime 31. My question is, given the following implementation of a hashcode functions for a pair or elements:

    @Override
    public int hashCode() {
        final int prime = 31;
        int hash1 = (pg1 == null) ? 0 : pg1.hashCode();
        int hash2 = (pg2 == null) ? 0 : pg2.hashCode();
        return prime * (hash1 ^ hash2);
    }
    

doesn’t this lead to an overflow on the returned int if prime is a large number?

  • Assuming that the overflow is not a problem (JVM doing an automatic cast) is it better to do a bitshift instead of a cast?

  • I imagine the performance of the hashcode function vary significantly based on the complexity of the hashcode. Does the size of the prime multiplier not effect the performance?

  • Is it better/smarter/faster to use multiple primes in a custom hashcode function instead of a single multiplier? If not, is there some other advantage? See the example below from @jinguy’s answer to a relevant question:

    public int hashCode() {
        return a * 13 + b.hashCode() * 23 + (c? 31: 7);
    }
    

where a is an int, b is a String and c is boolean.

  • How about something like long lhash = prime * (hash1 ^ hash2); then using (int)((lhash >> 32) ^ lhash)? That’s something I saw on another question here SO, but it wasn’t really explained why it was a good idea to do it like that.
  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-06-10T10:51:39+00:00Added an answer on June 10, 2026 at 10:51 am

    Apologies in advance for the novel. Feel free to make suggestions or edit directly. –Chet

    There is an overflow, but not an exception.

    The danger doesn’t come from losing accuracy, but losing range. Let’s use a ridiculous example, where “prime” is a large power of 2, and 8-bit unsigned numbers for brevity. And assume that (hash1 ^ hash2) is 255:

            "prime": 1000 0000
    (hash1 ^ hash2): 1111 1111
    

    Showing the truncated digits in brackets, our result is:

            product: [0111 1111] 1000 0000
    

    But multiplying by 128 is the same as shifting left by 7 places. So we know that whatever the value of (hash1 ^ hash2), the least-significant places of the product will have seven zeros. So if (hash1 ^ hash2) is odd (least significant bit = 1), then the result of multiplying by 128 will always be 128 (after truncating the higher digits). And if (hash1 ^ hash2) is even (LSB is 0, then the product will always be zero.

    This extends to larger bit sizes. The general point is that if the lower bits of “prime” are zeros, you’re doing a shift (or multiple shift + sum) operation that will give you zeros in the lower bits. And the range of the product of multiplication will suffer.

    But let’s try making “prime” odd, so that the least significant bit will always be 1. Think about decomposing this into shift / add operations. The unshifted value of (hash1 ^ hash2) will always be one of the summands. The least significant bits that were shifted into guaranteed uselessness by an even “prime” multiplier will now be set based on, at minimum, the bits from the original (hash1 ^ hash2) value.

    Now, let’s consider a value of prime which is actually prime. If it’s more than 2, then we know it’s odd. So the lower bits haven’t been shifted into uselessness. And by choosing a sufficiently large prime, you get better distribution across the range of output values than you’d get with a smaller prime.

    Try some exercises with 16-bit multiplication using 8443 (0010 0000 1111 1011) and 59 (0000 0000 0011 1011). They’re both prime, and the lower bits of 59 match the lower bits of 65531. For example, if hash1 and hash2 are both ASCII character values (0 .. 255), then all of the results of (hash1 ^ hash2) * 59 will be <= 15045. This means that roughly 1/4 of the range of hash values (0..65535) for a 16-bit number go unused.

    But (hash1 ^ hash2) * 8443 is all over the map. It overflows if (hash1 ^ hash2) is as low as 8. It uses all 16 bits even for very small input numbers. There’s much less clustering of hash values across the overall range, even if the input numbers are in a relatively small range.

    Assuming that the overflow is not a problem (JVM doing an automatic cast) is it better to do a bitshift instead of a cast?

    Most likely not. The JVM should translate into an efficient implementation on the host processor anyway. Integer multiplication should be implemented in hardware. And if not, the JVM is responsible for translating the operation into something reasonable for the CPU. It’s very likely that the case of integer multiplication is highly optimized already. If integer multiplication is done more quickly on a given CPU as shift-and-add, the JVM should implement it that way. But it’s less likely that the folks writing the JVM would care to watch for cases where multiple shift-and-add operations could have been combined into a single integer multiply.

    I imagine the performance of the hashcode function vary significantly based on the complexity of the hashcode. Does the size
    of the prime multiplier not effect the performance?

    No. The operations are the same when done in hardware regardless of the size, number of bits set, etc. It’s probably a couple of clock cycles. It would vary depending on the specific CPU, but should be a constant-time operation regardless of the input values.

    Is it better/smarter/faster to use multiple primes in a custom hashcode function instead of a single multiplier? If not, is there
    some other advantage?

    Only if it reduces the possibility of collisions, and this depends on the numbers you’re using. If your hash code depends on A and B and they’re in the same range, you might consider using different primes or shifting one of the input values to reduce overlap between the bits. Since you’re depending on their individual hash codes, and not their values directly, it’s reasonable to assume that their hash codes provide good distribution, etc.

    One factor that comes to mind whether you want the hash code for (x, y) to be different from (y, x). If your hash function treats A and B in the same way, then hash(x, y) = hash(y, x). If that’s what you want, then by all means use the same multiplier. It not, using a different multiplier would make sense.

    How about something like long lhash = prime * (hash1 ^ hash2); then using (int)((lhash >> 32) ^ lhash)? That’s something I saw on another question here SO, but it wasn’t really explained why it was a good idea to do it like that.

    Interesting question. In Java, longs are 64-bit and ints are 32-bit. So this generates a hash using twice as many bits as desired, and then derives the result from the high and low bits combined.

    If multiplying a number n by a prime p, and the lowermost k bits of n are all zeros, then the lowermost k bits of the product n * p will also be all zeros. This is fairly easy to see — if you’re multiplying, say, n = 0011 0000 and p = 0011 1011, then the product can be expressed as the sum of two shift operations. Or,

    00110000 * p = 00100000 * p + 00010000 * p
                 = p << 5 + p << 4
    

    Taking p = 59 and using unsigned 8-bit ints and 16-bit longs, here are some examples.

     64: 0011 1011 * 0100 0000 = [ 0000 1110 ] 1100 0000 (192)
    128: 0011 1011 * 1000 0000 = [ 0001 1101 ] 1000 0000 (128)
    192: 0011 1011 * 1100 0000 = [ 0010 1100 ] 0100 0000 (64)
    

    By just dropping the high bits of the result, the range of the resulting hash value is limited when the low bits of the non-prime multiplicand are all zeros. Whether that’s an issue in a specific context is, well, context-specific. But for a general hash function it’s a good idea to avoid limiting the range of output values even when there are patterns in the input numbers. And in security applications, it’s even more critical to avoid anything that would let someone make inferences about the original value based on patterns in the output. Just taking the low bits reveals the exact values of some of the original bits. If we make the assumption that the operation involved multiplying an input number with a large prime, then we know that the original number had as many zeros at the right as the hash output (because the prime’s rightmost bit was 1).

    By XORing the high bits with the low bits, there’s less consistency in the output. And more importantly, it’s much harder to make guesses about the input values based on this information. Based on how XOR works, it could mean the original low bit was 0 and the high bit was 1, or the original low bit was 1 and the high bit was 0.

     64: 0011 1011 * 0100 0000 = 0000 1110 1100 0000 => 1100 1110 (206)
    128: 0011 1011 * 1000 0000 = 0001 1101 1000 0000 => 1001 1101 (157)
    192: 0011 1011 * 1100 0000 = 0010 1100 0100 0000 => 0110 1100 (204)
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

I have been reading about entity framework over the past couple of days and
I have been reading about MVP, MVVM and have used asp.net MVC in couple
I have been reading about streams in Java the past days. After reading quite
I have been reading about methods and functions in Scala. Jim's post and Daniel's
Have been reading about async and tasks and been attempting to convert the CopyFileEx
I have been reading about making ajax heavy applications more search engine friendly: https://developers.google.com/webmasters/ajax-crawling/docs/getting-started
I have been reading about collision detection in games on stackoverflow and other sites.
I have been reading about the continuation passing style programming technique (C# 3.0 implementation).
I have been reading about Quicksort and found that sometimes it' s referred to
I have been reading about Scala for a while and even wrote some small

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.