The Archive Base

Editorial Team
Asked: May 29, 2026

We have an application that

  • Generates a hash code on a string
  • Saves that hash code into a DB along with associated data
  • Later, it queries the DB using the string hash code for retrieving the data

This is obviously a bug because the value returned from string.GetHashCode() varies across .NET versions and architectures (32-bit/64-bit). To complicate matters, we’re too close to a release to refactor our application to stop serializing hash codes and just query on the strings instead. What we’d like to do is come up with a quick and dirty fix for now, and refactor the code later to do it the right way.

The quick and dirty fix seems to be creating a static GetInvariantHashCode(string s) helper method that is consistent across architectures.

Can anyone suggest an algorithm for generating a hash code on a string that gives the same result on 32-bit and 64-bit architectures?

A few more notes:

  • I’m aware that HashCodes are not unique. If a hashcode returns a match on two different strings, we post process the results to find the exact match. It is not used as a primary key.
  • I believe the architect’s intent was to speed up the searches by querying on a long instead of an NVarChar
1 Answer

  1. Editorial Team
    Added an answer on May 29, 2026 at 7:59 am

    “I’m aware that HashCodes are not unique. If a hashcode returns a match on two different strings, we post process the results to find the exact match. It is not used as a primary key.

    I believe the architect’s intent was to speed up the searches by querying on a long instead of an NVarChar”

    Then just let the database index the strings for you!

    Look, I have no idea how large your domain is, but you’re going to get collisions very rapidly with very high likelihood if it’s of any decent size at all. It’s the birthday problem with a lot of people relative to the number of birthdays. You’re going to have collisions, and lose any gain in speed you might think you’re gaining by not just indexing the strings in the first place.
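To put rough numbers on the birthday problem mentioned above, here is a minimal sketch. It assumes hash codes are spread uniformly over the 2^32 possible Int32 values, which real hash functions only approximate. The class and method names are made up for illustration. Under that assumption, roughly 77,000 distinct strings already give about even odds of at least one collision.

```csharp
using System;

static class BirthdayEstimate
{
    // Approximate probability that at least two of n distinct strings share
    // a hash code, assuming codes are uniform over `buckets` possible values.
    public static double CollisionProbability(double n, double buckets)
    {
        // Standard birthday approximation: 1 - exp(-n(n-1) / (2 * buckets)).
        return 1.0 - Math.Exp(-n * (n - 1.0) / (2.0 * buckets));
    }

    static void Main()
    {
        double int32Values = Math.Pow(2, 32);
        foreach (double n in new[] { 10000.0, 77163.0, 1000000.0 })
        {
            double p = CollisionProbability(n, int32Values);
            Console.WriteLine($"{n:N0} strings: {p:P1} chance of a collision");
        }
    }
}
```

A collision only costs extra post-processing in the design described in the question, but the estimate shows such collisions are the normal case rather than a rare one once the table grows.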

    Anyway, you don’t need us if you’re stuck a few days away from release and you really need an invariant hash code across platforms. There are really dumb, really fast implementations of hash code out there that you can use. Hell, you could come up with one yourself in the blink of an eye:

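    // Depends only on the string's UTF-16 code units, so the result is the same on any runtime.
    static int GetInvariantHashCode(string s) {
        int hash = 17;
        foreach (char c in s) {
            // Use the char's numeric value; Char.GetHashCode() is itself implementation-defined.
            unchecked { hash = hash * 23 + c; }
        }
        return hash;
    }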
    

    Or you could use the old Bernstein hash. And on and on. Are they going to give you the performance gain you’re looking for? I don’t know, they weren’t meant to be used for this purpose. They were meant to be used for balancing hash tables. You’re not balancing a hash table. You’re using the wrong concept.
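For reference, a minimal sketch of a Bernstein-style hash (the variant commonly known as djb2: seed 5381, multiplier 33). The method name GetInvariantHashCode is borrowed from the question; the class name and the null check are assumptions of this sketch. Because C# int arithmetic is 32-bit everywhere and the loop reads only the UTF-16 code units, the result should not depend on the runtime version or the process bitness.

```csharp
using System;

static class BernsteinHash
{
    // Bernstein-style (djb2) hash: start at 5381, then hash = hash * 33 + c.
    public static int GetInvariantHashCode(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        unchecked // let the multiplication wrap instead of throwing on overflow
        {
            int hash = 5381;
            foreach (char c in s)
            {
                hash = hash * 33 + c; // c is widened to its numeric code unit value
            }
            return hash;
        }
    }

    static void Main()
    {
        Console.WriteLine(GetInvariantHashCode("Hello, world!"));
    }
}
```

Note that the value is computed over UTF-16 code units, so two strings that look the same but use different Unicode normalization forms will hash differently.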

    Edit (the below was written before the question was edited with new salient information):

    You can’t do this, at all, theoretically, without some kind of restriction on your input space. Your problem is far more severe than String.GetHashCode differing from platform to platform.

    There are a lot of instances of string. In fact, way more instances than there are instances of Int32. So, because of the pigeonhole principle, you will have collisions. You can’t avoid this: your strings are pigeons and your Int32 hash codes are pigeonholes and there are too many pigeons to go in the pigeonholes without some pigeonhole getting more than one pigeon. Because of collision problems, you can’t use hash codes as unique keys for strings. It doesn’t work. Period.
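Collisions do not need huge inputs to show up. The sketch below uses a multiply-and-add hash with seed 17 and multiplier 23 over the raw char values (an assumption of this example), for which the two-character strings "ab" and "bK" produce the same code. FindExact is a hypothetical helper showing the only safe way to use such a code: as a filter, with the string itself deciding the match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CollisionDemo
{
    // Multiply-and-add hash over the char values (seed 17, multiplier 23).
    public static int SimpleHash(string s)
    {
        unchecked
        {
            int hash = 17;
            foreach (char c in s) hash = hash * 23 + c;
            return hash;
        }
    }

    // Hypothetical lookup: narrow by hash code, then confirm with the string itself.
    public static string FindExact(IEnumerable<string> stored, string wanted)
    {
        int code = SimpleHash(wanted);
        return stored.Where(s => SimpleHash(s) == code)  // candidates, may include collisions
                     .FirstOrDefault(s => s == wanted);  // exact comparison decides
    }

    static void Main()
    {
        // 23 * 'a' + 'b' == 23 * 'b' + 'K' (23*97 + 98 == 23*98 + 75), so the codes match.
        Console.WriteLine(SimpleHash("ab") == SimpleHash("bK"));
        Console.WriteLine(FindExact(new[] { "bK", "ab" }, "ab"));
    }
}
```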

    The only way you can make your current proposed design work (using Int32 as an identifier for instances of string) is if you restrict your input space of strings to something that has a size less than or equal to the number of Int32s. Even then, you’ll have difficulty coming up with an algorithm that maps your input space of strings to Int32 in a unique way.

    Even if you try to increase the number of pigeonholes by using SHA-512 or whatever, you still have the possibility of collisions. I doubt you considered that possibility previously in your design; this design path is DOA. And that’s not what SHA-512 is for anyway: it’s not meant for unique identification of messages, just for reducing the likelihood of message forgery.
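For illustration only, here is a sketch of what "more pigeonholes" could look like: taking the first 8 bytes of a SHA-512 digest of the string's UTF-8 bytes as a long. The class and method names are hypothetical, and choosing UTF-8 and the leading 8 bytes are assumptions of this sketch, not something from the question. The bytes are combined by hand rather than with BitConverter so the result does not depend on the machine's byte order. The caveat above still applies: there are far more strings than 64-bit values, so an exact string comparison is still required.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class WideHash
{
    // Hypothetical helper: a 64-bit code taken from the start of a SHA-512 digest.
    public static long GetInvariantHashCode64(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));
        byte[] utf8 = Encoding.UTF8.GetBytes(s); // fixed encoding, independent of platform defaults
        using (SHA512 sha = SHA512.Create())
        {
            byte[] digest = sha.ComputeHash(utf8);
            long value = 0;
            for (int i = 0; i < 8; i++)
            {
                value = (value << 8) | digest[i]; // big-endian assembly of the first 8 bytes
            }
            return value;
        }
    }

    static void Main()
    {
        Console.WriteLine(GetInvariantHashCode64("Hello, world!"));
    }
}
```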

    “To complicate matters, we’re too close to a release to refactor our application to stop serializing hash codes and just query on the strings instead.”

    Well, then you have a tremendous amount of work ahead of you. I’m sorry you discovered this so late in the game.

    I note the documentation for String.GetHashCode:

    “The behavior of GetHashCode is dependent on its implementation, which might change from one version of the common language runtime to another. A reason why this might happen is to improve the performance of GetHashCode.”

    And from Object.GetHashCode:

    “The GetHashCode method is suitable for use in hashing algorithms and data structures such as a hash table.”

    Hash codes are for balancing hash tables. They are not for identifying objects. You could have caught this sooner if you had used the concept for what it is meant to be used for.
