The Archive Base


Editorial Team
Asked: May 31, 2026

Would it be possible/practical to create a compression algorithm that splits a file into chunks and then compares those chunks against an enormous (100 GB? 200 GB?) pseudo-random file?

The resulting “compressed” file would contain an ordered list of offsets and lengths. Everyone using the algorithm would need the same enormous file in order to compress/decompress files.

Would this work? I assume someone else has thought of this before and tried it, but it’s a tough one to Google.


1 Answer

  1. Editorial Team
    Added an answer on May 31, 2026 at 9:53 am

    It’s a common trick, used by many compression “claimers” who regularly announce “revolutionary” compression ratios, sometimes at ridiculous levels.

    The trick depends, obviously, on what’s in the reference dictionary.

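    If such a dictionary is just “random”, as suggested, then it is useless. Simple math shows why: a dictionary of N random bytes is only likely to contain a given chunk if that chunk is shorter than about log2(N) bits, and an offset into the dictionary also takes about log2(N) bits. For a 200 GB dictionary that is roughly 38 bits, or just under 5 bytes. So the offset will cost, on average, at least as much as the data it references.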
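To make the arithmetic concrete, here is a minimal sketch rather than a benchmark. It assumes each reference is stored as a fixed-width offset plus an 8-bit length, and it uses a tiny 64 KiB random dictionary so that it runs quickly. The names `offset_bits` and `greedy_reference_cost` are invented for this illustration.

```python
import math
import random


def offset_bits(dict_size: int) -> int:
    """Bits needed to address any byte offset in a dictionary of dict_size bytes."""
    return math.ceil(math.log2(dict_size))


def greedy_reference_cost(data: bytes, dictionary: bytes, length_bits: int = 8) -> int:
    """Total bits needed to encode data as (offset, length) references,
    always taking the longest match found in the dictionary."""
    bits_per_reference = offset_bits(len(dictionary)) + length_bits
    max_len = (1 << length_bits) - 1
    total_bits = 0
    i = 0
    while i < len(data):
        # Extend the match one byte at a time while it still occurs somewhere.
        k = 0
        while (k < max_len and i + k < len(data)
               and dictionary.find(data[i:i + k + 1]) != -1):
            k += 1
        # Assumption: a byte with no match at all is charged like one reference.
        total_bits += bits_per_reference
        i += max(k, 1)
    return total_bits


rng = random.Random(0)  # fixed seed so the run is repeatable
dictionary = bytes(rng.getrandbits(8) for _ in range(1 << 16))
data = bytes(rng.getrandbits(8) for _ in range(2000))

raw_bits = 8 * len(data)
encoded_bits = greedy_reference_cost(data, dictionary)
print("raw bits:    ", raw_bits)
print("encoded bits:", encoded_bits)
print("offset bits for a 200 GB dictionary:", offset_bits(200 * 10**9))
```

The encoded size comes out larger than the raw size, because matches in a 64 KiB random dictionary are rarely longer than 2 bytes while each offset alone costs 16 bits. The same imbalance holds at scale: a 200 GB random dictionary is only expected to contain matches of roughly 4 to 5 bytes, and each offset costs 38 bits.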

    But if the dictionary happens to contain large parts of the input file, or all of it, then the file will be “magically” compressed to a single reference or a short series of references.

    Such tricks are called “hiding the entropy”. Matt Mahoney wrote a simple program (barf) to demonstrate this technique, up to the point of reducing anything to 1 byte.
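The following toy sketch shows the general idea of hiding the entropy. It is not barf's actual code, and `KNOWN_FILES`, `compress`, and `decompress` are hypothetical names made up for the example. The “dictionary” simply contains the files the author expects to be tested on.

```python
# Hypothetical built-in dictionary: the exact files we expect to be benchmarked on.
KNOWN_FILES = [
    b"contents of benchmark file A " * 100,
    b"contents of benchmark file B " * 100,
]


def compress(data: bytes) -> bytes:
    """Return a 1-byte index if data is a known file, else a 0 marker plus the raw data."""
    for index, known in enumerate(KNOWN_FILES, start=1):
        if data == known:
            return bytes([index])
    return b"\x00" + data


def decompress(blob: bytes) -> bytes:
    """Invert compress(): look the file up by index, or strip the 0 marker."""
    if blob[0] == 0:
        return blob[1:]
    return KNOWN_FILES[blob[0] - 1]


benchmark = KNOWN_FILES[0]
other = b"a file the author never saw"

print(len(benchmark), "->", len(compress(benchmark)))  # known file shrinks to 1 byte
print(len(other), "->", len(compress(other)))          # unknown file grows by 1 byte
```

The known file “compresses” to one byte only because its full contents ship inside the decompressor. Nothing was removed; it was moved.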

    The solution to this trickery is that any comparison should always count the compressed data, the decompression program, and any external dictionary it uses. When all of these are included in the total, it’s no longer possible to “hide” entropy anywhere, and the cheat is revealed.
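A fair comparison along those lines can be sketched as below. The `Submission` class and all byte counts are made-up illustrations, not measurements of any real compressor.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    name: str
    compressed_bytes: int
    decompressor_bytes: int
    dictionary_bytes: int = 0  # any external file needed to decompress

    def total_bytes(self) -> int:
        """Everything the receiver must hold in order to reconstruct the original."""
        return self.compressed_bytes + self.decompressor_bytes + self.dictionary_bytes


# Hypothetical numbers, chosen only to illustrate the accounting.
honest = Submission("honest compressor", compressed_bytes=300_000,
                    decompressor_bytes=100_000)
cheat = Submission("hidden dictionary", compressed_bytes=1,
                   decompressor_bytes=50_000, dictionary_bytes=1_000_000)

# Rank by the full total, not by the compressed output alone.
ranking = sorted([honest, cheat], key=Submission.total_bytes)
for entry in ranking:
    print(entry.name, entry.total_bytes())
```

Ranked by compressed output alone, the hidden-dictionary entry wins with 1 byte. Ranked by the full total, it loses, because its dictionary is counted.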

© 2021 The Archive Base. All Rights Reserved