Editorial Team
Asked: May 25, 2026


Note: I'm not expecting SuperFastHash to produce the same output values as CRC32.

I'm writing a simple LZSS compression/decompression routine that provides very fast decompression with no memory overhead when decompressing. Input data is split into 4096-byte blocks, which are compressed sequentially.

My problem: I want to add error detection for each compressed block (block size <= 4096 bytes). The time constraint is severe, so the checksum routine must be very fast. I ruled out cryptographic algorithms (MD5, SHA-1) because they involve a lot of computation, and I chose CRC32 (a simpler and more obvious algorithm).

After running some tests, I found CRC32 too slow for my project's constraints. I used enwik9 (a 10^9-byte text dump of Wikipedia) from here. I compressed it with my LZSS routine and obtained a 570 MB file. I measured the following durations (single-threaded, disk I/O excluded, all data loaded in memory before processing, average of 10 trials):

|          Operation            |  Time (GCC4.4.5/Linux)   |  Time (MSVC2010/Win7)  |
|-------------------------------+--------------------------+------------------------|
|        Decompression          |        6.8 seconds       |      6.95 seconds      |
|  CRC32 on decompressed result |        4.9 seconds       |      4.62 seconds      |
|   CRC32 on compressed result  |        2.8 seconds       |      2.69 seconds      |
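
For readers who want to reproduce this kind of measurement, here is a minimal sketch of a timing harness, not the one actually used for the figures above. The names `average_seconds`, `checksum_fn` and `byte_sum` are hypothetical, `byte_sum` is only a placeholder to be swapped for the checksum under test, and `clock()` reports CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature shared by every checksum routine under test. */
typedef uint32_t (*checksum_fn)(const uint8_t *data, size_t len);

enum { BLOCK_SIZE = 4096 };

/* Placeholder checksum (plain byte sum); replace it with CRC32, SFH, etc. */
uint32_t byte_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    while (len--)
        sum += *data++;
    return sum;
}

/* Checksums an in-memory buffer block by block, `trials` times, and returns
 * the average CPU time per trial in seconds. The accumulated checksums are
 * written to *sink so the compiler cannot optimise the work away. */
double average_seconds(checksum_fn fn, const uint8_t *data, size_t len,
                       int trials, uint32_t *sink)
{
    assert(fn != NULL && trials > 0);
    uint32_t acc = 0;
    clock_t start = clock();
    for (int t = 0; t < trials; t++) {
        for (size_t off = 0; off < len; off += BLOCK_SIZE) {
            size_t n = (len - off < BLOCK_SIZE) ? len - off : BLOCK_SIZE;
            acc += fn(data + off, n);
        }
    }
    clock_t end = clock();
    if (sink != NULL)
        *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC / trials;
}
```

Calling `average_seconds(crc32_hash, buffer, size, 10, &sink)` on a buffer that is already in memory would correspond to the "average of 10 trials, disk I/O excluded" setup described above.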

Then I tested SuperFastHash, just out of curiosity:

|          Operation            |  Time (GCC4.4.5/Linux)   |  Time (MSVC2010/Win7)  |
|-------------------------------+--------------------------+------------------------|
|  SFH on decompressed result   |        1.1 seconds       |      1.33 seconds      |
|   SFH on compressed result    |        0.7 seconds       |      0.75 seconds      |

And here is my CRC32 implementation (I followed the description in this document: http://www.ross.net/crc/download/crc_v3.txt):

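#include <stddef.h>   // size_t
#include <stdint.h>   // uint8_t, uint32_t

// CRC32 lookup table for the polynomial 0x04C11DB7, stored in its
// reflected (LSB-first) form, i.e. generated from 0xEDB88320
static const uint32_t crc32_lookup_table[256] =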
{
    0x00000000, 0x77073096, 0xEE0E612C, 0x990951BA,
    0x076DC419, 0x706AF48F, 0xE963A535, 0x9E6495A3,
    0x0EDB8832, 0x79DCB8A4, 0xE0D5E91E, 0x97D2D988,
    // many lines skipped
    // ...
    0xB40BBE37, 0xC30C8EA1, 0x5A05DF1B, 0x2D02EF8D
} ;

uint32_t crc32_hash(const uint8_t * data, size_t len)
{
    uint32_t crc32_register = 0xFFFFFFFF ;
    while( len-- )
    {
        crc32_register = (crc32_register >> 8)
                       ^ crc32_lookup_table[(crc32_register & 0x000000FF) ^ *data++] ;
    }
    return crc32_register ^ 0xFFFFFFFF ;
}
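
Since most of the table is elided above, here is a minimal sketch that generates it at run time instead of spelling it out. It assumes the standard reflected CRC-32 parameters that the visible table entries correspond to (polynomial 0xEDB88320, initial value and final XOR of 0xFFFFFFFF). The names `crc32_build_table` and `crc32_generated` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the polynomial 0x04C11DB7. */
#define CRC32_POLY_REFLECTED 0xEDB88320u

static uint32_t crc32_table[256];
static int crc32_table_ready = 0;

/* Fill the 256-entry table: entry i is the register value obtained by
 * feeding the single byte i through eight shift-and-XOR steps. */
static void crc32_build_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t r = i;
        for (int bit = 0; bit < 8; bit++)
            r = (r & 1u) ? (r >> 1) ^ CRC32_POLY_REFLECTED : (r >> 1);
        crc32_table[i] = r;
    }
    crc32_table_ready = 1;
}

/* Same byte-at-a-time loop as crc32_hash, using the generated table. */
uint32_t crc32_generated(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    if (!crc32_table_ready)
        crc32_build_table();

    uint32_t reg = 0xFFFFFFFFu;
    while (len--)
        reg = (reg >> 8) ^ crc32_table[(reg & 0xFFu) ^ *data++];
    return reg ^ 0xFFFFFFFFu;
}
```

A quick sanity check for any implementation with these parameters is the nine ASCII bytes "123456789", whose CRC-32 is 0xCBF43926.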

My question is:

Can I use a hash instead of a cyclic redundancy check value to perform error detection on compressed data blocks? As far as I know (and as I remember from my electronics course), CRC algorithms are designed to be very effective when errors occur in bursts, as happens when data is transmitted over a noisy channel, which is not the case for data read from hard drives. Please correct me if I'm wrong.
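
To make the burst-error property concrete, here is a minimal sketch that injects error bursts of at most 32 bits into a small buffer and counts how many the CRC misses. It assumes the same reflected CRC-32 parameters as the implementation above, computed bit by bit so the block is self-contained. The names `crc32_bitwise`, `xor_burst` and `count_undetected_bursts` are hypothetical, and the buffer size and pattern count are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, init and final XOR
 * 0xFFFFFFFF (same result as the table-driven version, only slower). */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t reg = 0xFFFFFFFFu;
    while (len--) {
        reg ^= *data++;
        for (int bit = 0; bit < 8; bit++)
            reg = (reg & 1u) ? (reg >> 1) ^ 0xEDB88320u : (reg >> 1);
    }
    return reg ^ 0xFFFFFFFFu;
}

/* XOR a 32-bit error pattern into buf starting at bit offset `start`.
 * Bits are numbered LSB-first within each byte, which is the order in
 * which a reflected CRC consumes them, so the burst is contiguous from
 * the CRC's point of view. */
static void xor_burst(uint8_t *buf, size_t start, uint32_t pattern)
{
    for (size_t i = 0; i < 32; i++)
        if ((pattern >> i) & 1u)
            buf[(start + i) >> 3] ^= (uint8_t)(1u << ((start + i) & 7));
}

/* Returns how many injected bursts left the CRC unchanged. */
unsigned count_undetected_bursts(void)
{
    enum { LEN = 64, PATTERNS_PER_OFFSET = 8 };
    uint8_t buf[LEN];
    for (size_t i = 0; i < LEN; i++)
        buf[i] = (uint8_t)(i * 37u + 11u);      /* arbitrary test data */

    const uint32_t good = crc32_bitwise(buf, LEN);
    uint32_t lcg = 12345u;                      /* tiny pseudo-random generator */
    unsigned undetected = 0;

    for (size_t start = 0; start + 32 <= (size_t)LEN * 8; start++) {
        for (int p = 0; p < PATTERNS_PER_OFFSET; p++) {
            lcg = lcg * 1664525u + 1013904223u;
            uint32_t pattern = lcg | 1u;        /* force a nonzero pattern */
            xor_burst(buf, start, pattern);     /* corrupt */
            if (crc32_bitwise(buf, LEN) == good)
                undetected++;
            xor_burst(buf, start, pattern);     /* restore */
        }
    }
    assert(crc32_bitwise(buf, LEN) == good);    /* buffer is back to normal */
    return undetected;
}
```

A 32-bit CRC whose polynomial has a nonzero constant term detects every burst of 32 bits or fewer, so the count is 0 here. A general-purpose hash gives no such guarantee and detects errors only with high probability.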

Thanks for any advice!

1 Answer

Editorial Team
Added an answer on May 25, 2026 at 1:10 pm

Since your problem is not about security, you can use "broken" cryptographic hash functions, which are not secure against a sentient attacker but are still very good at detecting transmission errors. I am thinking of MD4, which has been measured to be faster than CRC32 on some platforms. You may also want to check RadioGatún and Panama; see this library for open-source implementations in C and Java of various cryptographic hash functions.

If your target architecture is a recent or big enough x86 CPU that features the AES-NI instructions, you could build a devilishly fast and very good checksum by simply computing a CBC-MAC with the AES block cipher and a conventional key (e.g. an all-zero key). Since this is not for security, you could even use fewer rounds than standard AES (e.g. 5 rounds instead of the standard 10).
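
As a rough illustration of the AES-NI idea, here is a minimal sketch of a reduced-round CBC-MAC, not a vetted implementation. It assumes GCC or Clang on an x86 CPU with AES-NI, and it uses the first six round keys of the AES-128 key schedule for an all-zero key to run 5 rounds. Prepending a length block and zero-padding the tail are choices made for this sketch only. The names `cbc_mac_tag`, `expand_zero_key` and `encrypt_reduced` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>   /* SSE2: loads, stores, XOR, shifts */
#include <wmmintrin.h>   /* AES-NI: _mm_aesenc_si128 and friends */

/* Enable AES-NI per function so no special compiler flag is needed
 * (GCC/Clang extension; an assumption of this sketch). */
#define AESNI_FN __attribute__((target("aes,sse2")))

enum { MAC_ROUNDS = 5 };   /* reduced from the standard 10 */

/* One step of the AES-128 key schedule. */
AESNI_FN static __m128i expand_step(__m128i key, __m128i assist)
{
    assist = _mm_shuffle_epi32(assist, _MM_SHUFFLE(3, 3, 3, 3));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    return _mm_xor_si128(key, assist);
}

/* Round keys 0..5 derived from the all-zero key. */
AESNI_FN static void expand_zero_key(__m128i rk[MAC_ROUNDS + 1])
{
    rk[0] = _mm_setzero_si128();
    rk[1] = expand_step(rk[0], _mm_aeskeygenassist_si128(rk[0], 0x01));
    rk[2] = expand_step(rk[1], _mm_aeskeygenassist_si128(rk[1], 0x02));
    rk[3] = expand_step(rk[2], _mm_aeskeygenassist_si128(rk[2], 0x04));
    rk[4] = expand_step(rk[3], _mm_aeskeygenassist_si128(rk[3], 0x08));
    rk[5] = expand_step(rk[4], _mm_aeskeygenassist_si128(rk[4], 0x10));
}

/* Reduced-round AES: initial key XOR, 4 full rounds, 1 final round. */
AESNI_FN static __m128i encrypt_reduced(__m128i x,
                                        const __m128i rk[MAC_ROUNDS + 1])
{
    x = _mm_xor_si128(x, rk[0]);
    for (int r = 1; r < MAC_ROUNDS; r++)
        x = _mm_aesenc_si128(x, rk[r]);
    return _mm_aesenclast_si128(x, rk[MAC_ROUNDS]);
}

/* CBC-MAC over `data`: state = E(state XOR block) for every 16-byte block.
 * Writes the full 128-bit state to `tag`. */
AESNI_FN void cbc_mac_tag(const uint8_t *data, size_t len, uint8_t tag[16])
{
    assert(data != NULL || len == 0);
    __m128i rk[MAC_ROUNDS + 1];
    expand_zero_key(rk);

    /* First block carries the length, so inputs that differ only in
     * trailing zero bytes do not share a tag. */
    uint8_t block[16] = {0};
    uint64_t n = (uint64_t)len;
    memcpy(block, &n, sizeof n);
    __m128i state =
        encrypt_reduced(_mm_loadu_si128((const __m128i *)block), rk);

    while (len >= 16) {
        __m128i m = _mm_loadu_si128((const __m128i *)data);
        state = encrypt_reduced(_mm_xor_si128(state, m), rk);
        data += 16;
        len -= 16;
    }
    if (len > 0) {                       /* zero-padded tail block */
        memset(block, 0, sizeof block);
        memcpy(block, data, len);
        __m128i m = _mm_loadu_si128((const __m128i *)block);
        state = encrypt_reduced(_mm_xor_si128(state, m), rk);
    }
    _mm_storeu_si128((__m128i *)tag, state);
}
```

For a 32-bit per-block checksum, one option is to keep only the first 4 bytes of `tag`. Unlike a CRC, a truncated tag detects errors only with high probability and does not guarantee catching every short burst.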
