The Archive Base

Editorial Team
Asked: June 10, 2026

My problem is this. I have a block of data. Occasionally this block of data is updated and a new changed version appears. I need to detect if the data I am looking at matches the version I am expecting to receive.

I have decided to use a fingerprint so that I can avoid storing the ‘expected’ version of the data in full. It seems that the ‘default’ choice for this kind of thing is an MD5 hash.
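As a point of reference, the fingerprint check described above might look like the following minimal sketch using MD5 from Python's standard library. The function names `fingerprint` and `matches_expected` are hypothetical, chosen only for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the MD5 digest of a block as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


def matches_expected(data: bytes, expected_fingerprint: str) -> bool:
    """True if the block hashes to the stored fingerprint."""
    return fingerprint(data) == expected_fingerprint


# Only the short digest is stored, not the expected block itself.
expected = fingerprint(b"version 1 of the block")
print(matches_expected(b"version 1 of the block", expected))
print(matches_expected(b"version 2 of the block", expected))
```

Swapping in a different hash function would only change the body of `fingerprint`.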

However, MD5 was designed to be cryptographically secure, and there are much faster hashing functions. I am looking at modern non-cryptographic functions such as CityHash and SpookyHash.

Since I control all the data in my system I only care about accidental collisions where a changed block of data hashes to the same value. Therefore I don’t think I have to worry about the ‘attacker-proof’ nature of cryptographic hashes and could get away with a simpler hash function.
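To put a rough number on accidental collisions, here is a small sketch. It assumes an idealised hash whose output is uniformly distributed, which real non-cryptographic hashes only approximate, so treat the results as estimates rather than guarantees. Both function names are made up for this example.

```python
import math


def single_pair_collision_probability(bits: int) -> float:
    """Chance that one changed block hashes to one specific expected
    value, assuming an ideal hash with uniformly distributed output."""
    return 2.0 ** -bits


def birthday_collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that any two of n_items distinct blocks share
    a hash value (birthday bound, same idealised-hash assumption)."""
    exponent = -n_items * (n_items - 1) / 2.0 ** (bits + 1)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)


print(single_pair_collision_probability(64))
print(birthday_collision_probability(1_000_000, 64))
print(birthday_collision_probability(1_000_000, 128))
```

Checking one received block against one expected fingerprint is the single-pair case; the birthday estimate only matters when many fingerprints are compared against each other.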

Are there any problems with using a hash function such as CityHash or SpookyHash for this purpose, or should I just stick with MD5? Or should I be using something specifically designed for fingerprinting such as a Rabin fingerprint?

1 Answer

  1. Editorial Team
    Added an answer on June 10, 2026 at 11:05 pm

    Yes, it’s okay (also take a look at the CRC family of functions, which can be even faster where hardware support is available). However, I tend to avoid using hashes to differentiate versions of data. A serial number combined with a date/time value provides a means to determine which version is newer and to detect out-of-sync changes. Fingerprints are used more to detect corrupted files than for versioning.
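A minimal sketch of the serial-number-plus-timestamp idea follows. The `VersionStamp` type, the `classify` function, and the rule that "same serial but different timestamp" means out-of-sync are all assumptions made for illustration, not something prescribed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VersionStamp:
    serial: int           # incremented on every update (assumed convention)
    updated_at: datetime  # when that update was made


def classify(received: VersionStamp, expected: VersionStamp) -> str:
    """Compare a received version stamp with the expected one."""
    if received == expected:
        return "match"
    if received.serial == expected.serial:
        # Same serial but a different timestamp: two writers diverged.
        return "out-of-sync"
    return "newer" if received.serial > expected.serial else "older"


expected = VersionStamp(3, datetime(2026, 6, 10, 23, 5, tzinfo=timezone.utc))
received = VersionStamp(4, datetime(2026, 6, 10, 23, 9, tzinfo=timezone.utc))
print(classify(received, expected))
```

Unlike a fingerprint, the stamp says which version is newer, but it only works if every update reliably bumps the serial.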

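    If you want to compare one set of data with another and both are at hand, then don’t use hashes/fingerprints, just compare the data directly. It’s faster to compare two streams than it is to take the hashes of two streams and then compare the hashes, because a direct comparison can stop at the first difference while hashing must read both streams in full.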
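A direct comparison of two streams could be sketched as below. The name `streams_equal` and the 64 KiB chunk size are arbitrary choices for this example, and the code assumes buffered binary streams whose `read(n)` returns fewer than `n` bytes only at end of stream.

```python
import io
from typing import BinaryIO


def streams_equal(a: BinaryIO, b: BinaryIO, chunk_size: int = 64 * 1024) -> bool:
    """Compare two binary streams chunk by chunk, stopping at the
    first difference instead of reading everything."""
    while True:
        chunk_a = a.read(chunk_size)
        chunk_b = b.read(chunk_size)
        if chunk_a != chunk_b:
            return False
        if not chunk_a:  # both streams ended at the same point
            return True


print(streams_equal(io.BytesIO(b"block of data"), io.BytesIO(b"block of data")))
print(streams_equal(io.BytesIO(b"block of data"), io.BytesIO(b"block of dat4")))
```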

    That said, a good quick way to compare lots of files is to take the hash of each file, compare the hashes, and, when there’s a hash match, compare the raw bytes. The chance of a hash collision is indeed minimal, but it isn’t impossible, and I like to be absolutely sure.
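The hash-first, bytes-second approach might look like this sketch. It uses MD5 and `filecmp.cmp` from the standard library; `file_digest` and `find_duplicates` are hypothetical helper names, and the temporary files exist only to make the example runnable.

```python
import filecmp
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths: list[Path]) -> list[tuple[Path, Path]]:
    """Group files by digest, then confirm each match byte for byte."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in paths:
        by_digest[file_digest(path)].append(path)

    confirmed = []
    for candidates in by_digest.values():
        for i, first in enumerate(candidates):
            for second in candidates[i + 1:]:
                # shallow=False forces a comparison of the file contents.
                if filecmp.cmp(first, second, shallow=False):
                    confirmed.append((first, second))
    return confirmed


with tempfile.TemporaryDirectory() as tmp:
    a, b, c = (Path(tmp) / name for name in ("a.bin", "b.bin", "c.bin"))
    a.write_bytes(b"same contents")
    b.write_bytes(b"same contents")
    c.write_bytes(b"different contents")
    duplicates = [(x.name, y.name) for x, y in find_duplicates([a, b, c])]

print(duplicates)
```

Each file is hashed once, and the expensive byte-for-byte check runs only on pairs whose digests already agree.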


© 2021 The Archive Base. All Rights Reserved