The Archive Base

Editorial Team
Asked: June 18, 2026

How do I get a short hash of a long string using Excel VBA?

What’s given

  • Input string is not longer than 80 characters
  • Valid input characters are: [0-9] [A-Z] . _ /
  • Valid output characters are: [0-9] [A-Z] [a-z] (both lower and upper case may be used)
  • The output hash shouldn’t be longer than ~12 characters (shorter is even better)
  • The hash does not need to be strictly unique, since guaranteeing that would make it too long

What I have done so far

I thought this SO answer was a good start, since it generates a 4-digit hex code (CRC16).

But 4 digits were too few: in my test with 400 strings, 20% had a duplicate somewhere else.
The chance of a collision is too high.

Sub tester()
    For i = 2 To 433
        Cells(i, 2) = CRC16(Cells(i, 1))
    Next i
End Sub


Function CRC16(txt As String)
Dim x As Long
Dim mask, i, j, nC, Crc As Integer
Dim c As String

Crc = &HFFFF

For nC = 1 To Len(txt)
    j = Val("&H" + Mid(txt, nC, 2))
    Crc = Crc Xor j
    For j = 1 To 8
        mask = 0
        If Crc / 2 <> Int(Crc / 2) Then mask = &HA001
        Crc = Int(Crc / 2) And &H7FFF: Crc = Crc Xor mask
    Next j
Next nC

CRC16 = Hex$(Crc)
End Function

How to reproduce

You can copy these 400 test strings from pastebin.
Paste them into column A of a new Excel workbook and execute the code above.

Q: How do I get a string hash that is short enough (at most 12 characters) yet long enough to keep the percentage of duplicates small?

1 Answer

  1. Editorial Team (answered June 18, 2026 at 11:26 am)

    Split your string into three shorter strings (if not divisible by three, the last one will be longer than the other two). Run your “short” algorithm on each, and concatenate the results.

    I could write the code but based on the quality of the question I think you can take it from here!

    EDIT: It turns out that that advice is not enough. There is a serious flaw in your original CRC16 code – namely the line that says:

    j = Val("&H" + Mid(txt, nC, 2))
    

    This only handles text that can be interpreted as hex values: lowercase and uppercase letters are the same, and anything after F in the alphabet is ignored (as far as I can tell). That anything good comes out at all is a miracle. If you replace the line with

    j = asc(mid(txt, nC, 1))
    

    Things work better – every ASCII code at least starts out life as its own value.
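To see the difference concretely, a small sketch like the one below can be run from the VBA editor, with the output appearing in the Immediate window. The procedure name CompareValAndAsc and the sample strings are made up for illustration. "1F" and "1f" both come out of Val as 31, which shows the case-insensitivity; how Val treats the non-hex samples is best checked on your own machine rather than taken from me.

```vba
Sub CompareValAndAsc()
    ' Hypothetical demo, not part of the original code.
    ' Prints what each approach would feed into the CRC for a few samples.
    Dim samples As Variant, sample As Variant

    samples = Array("1F", "1f", "G7", "Z_")

    For Each sample In samples
        ' Val: reads the two characters as a hex literal and stops at the
        '      first character it cannot interpret as part of a number.
        ' Asc: returns the character code of the first character only.
        Debug.Print sample, "Val:"; Val("&H" & sample), "Asc:"; Asc(Left$(sample, 1))
    Next sample
End Sub
```

With Asc, every distinct character contributes a distinct value, which is what the CRC loop needs.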

    Combining this change with the proposal I made earlier, you get the following code:

    Function hash12(s As String) As String
    ' create a 12 character hash from string s:
    ' split s into three parts, hash each part to 4 hex characters, concatenate
    
    Dim n As Long, n3 As Long
    Dim s1 As String, s2 As String, s3 As String
    
    n = Len(s)
    n3 = n \ 3              ' integer division: length of the first two parts
    s1 = Mid(s, 1, n3)      ' first part
    s2 = Mid(s, n3 + 1, n3) ' middle part
    s3 = Mid(s, 2 * n3 + 1) ' the rest of the string (takes up any remainder)
    
    hash12 = hash4(s1) & hash4(s2) & hash4(s3)
    
    End Function
    
    Function hash4(txt As String) As String
    ' CRC16-style hash of txt, returned as exactly 4 hex characters
    ' (adapted from the CRC16 function in the question)
    Dim crc As Integer, mask As Integer
    Dim nC As Long, bitNo As Long
    
    crc = &HFFFF
    
    For nC = 1 To Len(txt)
        ' <<<<<<< new line of code - makes all the difference:
        ' use the character code instead of Val("&H" + Mid(txt, nC, 2))
        crc = crc Xor Asc(Mid(txt, nC, 1))
        For bitNo = 1 To 8
            mask = 0
            If (crc And 1) <> 0 Then mask = &HA001   ' low bit set: apply the polynomial
            crc = Int(crc / 2) And &H7FFF            ' shift right, clearing the sign bit
            crc = crc Xor mask
        Next bitNo
    Next nC
    
    ' <<<<< new section: left-pad with zeros so the result is always 4 characters >>>>>
    hash4 = Right$("0000" & Hex$(crc), 4)
    
    End Function
    

    You can call this from your spreadsheet as =hash12(A2) etc. (pass the cell reference without quotes; =hash12("A2") would hash the literal text A2). For fun, you can also use the “new, improved” hash4 algorithm on its own and see how the two compare. I created a pivot table to count collisions: there were none for the hash12 algorithm, and only 3 for hash4. I’m sure you can figure out how to create hash8, … from this. The “no need to be unique” in your question suggests that maybe the “improved” hash4 is all you need.
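If you would rather not build a pivot table, a small helper along these lines could count the duplicates directly. This is only a sketch: the name CountCollisions is made up, it assumes Excel on Windows (where Scripting.Dictionary is available through CreateObject), and it counts every repeat of an earlier value as one collision, which may not be exactly how the pivot-table count above was made.

```vba
Function CountCollisions(hashes As Range) As Long
    ' Hypothetical helper, not part of the original answer.
    ' Counts cells whose text already appeared earlier in the range.
    Dim seen As Object
    Dim cell As Range
    Dim key As String

    ' Late-bound dictionary; its default compare mode is case-sensitive
    Set seen = CreateObject("Scripting.Dictionary")

    For Each cell In hashes.Cells
        key = CStr(cell.Value)
        If Len(key) > 0 Then                 ' skip empty cells
            If seen.Exists(key) Then
                CountCollisions = CountCollisions + 1
            Else
                seen.Add key, True
            End If
        End If
    Next cell
End Function
```

With the hashes in column B, as in the tester loop from the question, it could be used as =CountCollisions(B2:B433).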

    In principle, a four-character hex hash has 65,536 (64k) possible values, so the chance of two random strings having the same hash would be 1 in 64k. With 400 strings there are 400 x 399 / 2 = 79,800 “possible collision pairs”, roughly 80k opportunities, so you would expect about 1.2 collisions if the strings were highly random. Observing three collisions in the sample dataset is therefore not an unreasonable score. As your number of strings N goes up, the expected number of collisions grows roughly as the square of N. With the extra 32 bits of information in hash12 (48 bits in total), you would expect to start seeing collisions when N > 20 M or so (handwaving, in-my-head math).
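The back-of-the-envelope estimate can be written as a one-line function. This is a sketch under the same assumption as the paragraph above, namely that the hashes behave like uniformly random values; the name ExpectedCollisions is made up for illustration.

```vba
Function ExpectedCollisions(n As Double, bits As Long) As Double
    ' Hypothetical helper, not part of the original answer.
    ' Expected number of colliding pairs among n uniformly random hashes:
    '   number of pairs = n * (n - 1) / 2
    '   each pair collides with probability 1 / 2 ^ bits
    ' n is a Double so the product cannot overflow an Integer or Long.
    ExpectedCollisions = n * (n - 1) / 2 / (2 ^ bits)
End Function
```

ExpectedCollisions(400, 16) comes out at about 1.2 for the 4-character hash, and ExpectedCollisions(20000000, 48) at about 0.7 for the 12-character one, in line with the rough figures above.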

    You can make the hash12 code a little bit more compact, obviously – and it should be easy to see how to extend it to any length.
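One way the generalization might look is sketched below. The names HashParts and Crc16Hex are made up; Crc16Hex repeats the CRC loop with the Asc fix so that the block is self-contained. With parts = 3 it should give the same 12 characters as hash12, since the string is split the same way.

```vba
Function HashParts(s As String, parts As Long) As String
    ' Hypothetical generalization of hash12: split s into `parts` pieces,
    ' hash each piece to 4 hex characters, and join them (4 * parts characters).
    Dim pieceLen As Long, i As Long, piece As String

    If parts < 1 Then Exit Function          ' returns "" for a nonsensical request
    pieceLen = Len(s) \ parts

    For i = 1 To parts
        If i < parts Then
            piece = Mid$(s, (i - 1) * pieceLen + 1, pieceLen)
        Else
            piece = Mid$(s, (i - 1) * pieceLen + 1)   ' last piece takes the remainder
        End If
        HashParts = HashParts & Crc16Hex(piece)
    Next i
End Function

Function Crc16Hex(txt As String) As String
    ' CRC16-style hash of txt as exactly 4 hex characters,
    ' using the character code of each character (the Asc fix).
    Dim crc As Integer, mask As Integer
    Dim nC As Long, bitNo As Long

    crc = &HFFFF
    For nC = 1 To Len(txt)
        crc = crc Xor Asc(Mid$(txt, nC, 1))
        For bitNo = 1 To 8
            mask = 0
            If (crc And 1) <> 0 Then mask = &HA001
            crc = Int(crc / 2) And &H7FFF    ' shift right, clearing the sign bit
            crc = crc Xor mask
        Next bitNo
    Next nC

    Crc16Hex = Right$("0000" & Hex$(crc), 4) ' left-pad to 4 characters
End Function
```

In a worksheet, =HashParts(A2, 2) would give an 8-character hash and =HashParts(A2, 3) a 12-character one. Pieces shorter than one character (when the string is shorter than parts) simply hash the empty string.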

    Oh – and one last thing. If you have RC addressing enabled, using =CRC16("string") as a spreadsheet formula gives a hard-to-track #REF error… which is why I renamed it hash4
