I got a string of an arbitrary length (lets say 5 to 2000 characters)

Editorial Team

Asked: May 31, 2026 (2026-05-31T20:29:35+00:00)

I have a string of arbitrary length (let's say 5 to 2000 characters) for which I would like to calculate a checksum.

Requirements

The same checksum must be returned each time it is calculated for a given string
The checksum must be unique (no collisions)
I cannot store previous IDs to check for collisions

Which algorithm should I use?

Update:

Is there an approach that is reasonably unique, i.e. one where the likelihood of a collision is very small?
The checksum should be alphanumeric
The strings are Unicode
The strings are actually texts that should be translated and the checksum is stored with each translation (so a translated text can be matched back to the original text).
The length of the checksum is not important for me (the shorter, the better)

Update 2:

Let’s say that I have the following string: "Welcome to this website. Navigate using the flashy but useless menu above".

The string is used in a view in a way similar to gettext on Linux, i.e. the user just writes (in a Razor view):

@T("Welcome to this website. Navigate using the flashy but useless menu above")

Now I need a way to identify that string so that I can fetch it from a data source (there are several implementations of the data source). Having to use the entire string as a key seems a bit inefficient, and I’m therefore looking for a way to generate a key out of it.

1 Answer

Editorial Team · Answer 1 · 2026-05-31T20:29:36+00:00

That’s not possible.

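If you can’t store previous values, it’s not possible to create a unique checksum that is smaller than the information in the string: there are more possible strings than possible shorter checksums, so at least two strings must share a checksum.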

Update:

Strictly speaking, the term "reasonably unique" doesn’t make sense: a value is either unique or it’s not.

To get a reasonably low risk of hash collisions, you can use a reasonably large hash code.
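
To put a rough number on "reasonably low", here is a minimal sketch that estimates the collision risk with the birthday bound. It assumes an ideal, uniformly distributed hash, which real algorithms only approximate; the function name `CollisionProbability` and the item counts are made up for illustration.

```csharp
using System;

// Approximate chance that at least two of `itemCount` distinct strings share
// a hash value, assuming an ideal (uniformly distributed) hash of `hashBits` bits.
// Uses the birthday bound p ≈ 1 - exp(-n(n-1) / 2^(bits+1)).
double CollisionProbability(double itemCount, int hashBits)
{
    double exponent = itemCount * (itemCount - 1) / Math.Pow(2.0, hashBits + 1);
    // For tiny exponents, 1 - exp(-x) rounds to 0 in double precision,
    // so fall back to the first-order approximation p ≈ x.
    return exponent < 1e-8 ? exponent : 1.0 - Math.Exp(-exponent);
}

Console.WriteLine(CollisionProbability(1e6, 128)); // one million texts, 128-bit hash
Console.WriteLine(CollisionProbability(1e6, 32));  // same texts, 32-bit hash
```

Under that assumption, one million texts with a 128-bit hash give a risk on the order of 1.5e-27, while a 32-bit hash makes a collision among the same texts almost certain.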

The MD5 algorithm, for example, produces a 16-byte (128-bit) hash code. Convert the string to a byte array using an encoding that preserves all characters, for example UTF-8, calculate the hash code using the MD5 class, then convert the hash code byte array into a hexadecimal string using the BitConverter class:

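using System;
using System.Security.Cryptography;
using System.Text;

string theString = "asdf";

string hash;
using (MD5 md5 = MD5.Create()) {
  // Encode as UTF-8 so every Unicode character is preserved, then hash the bytes.
  byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(theString));
  // BitConverter yields "91-2E-C8-..."; drop the dashes to get plain hex.
  hash = BitConverter.ToString(digest).Replace("-", String.Empty);
}

Console.WriteLine(hash);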

Output:

912EC803B2CE49E4A541068D495AB570
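
Since the question asks for an alphanumeric key that is as short as possible, the 32-character hex string above could be shortened by writing the same 16 bytes in base 36 (digits plus lowercase letters), which needs at most 25 characters. The following is only a sketch of that idea: `TranslationKey` is a hypothetical helper name, and the choice of base 36 is an assumption, not something the answer prescribes.

```csharp
using System;
using System.Numerics;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: derives a fixed-length, 25-character alphanumeric key
// from a text by base-36 encoding its MD5 hash.
string TranslationKey(string text)
{
    byte[] digest;
    using (MD5 md5 = MD5.Create())
    {
        digest = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
    }

    // BigInteger reads bytes as little-endian and signed; a trailing zero
    // byte guarantees the value is interpreted as non-negative.
    byte[] unsignedBytes = new byte[digest.Length + 1];
    Array.Copy(digest, unsignedBytes, digest.Length);
    BigInteger value = new BigInteger(unsignedBytes);

    const string alphabet = "0123456789abcdefghijklmnopqrstuvwxyz";
    var key = new StringBuilder();
    do
    {
        key.Insert(0, alphabet[(int)(value % 36)]);
        value /= 36;
    } while (!value.IsZero);

    // A 128-bit value needs at most 25 base-36 digits; pad so all keys have equal length.
    return key.ToString().PadLeft(25, '0');
}

Console.WriteLine(TranslationKey("Welcome to this website. Navigate using the flashy but useless menu above"));
```

The key carries the same 128 bits as the hex form, so the collision risk is unchanged; only the representation is shorter.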
