The Archive Base Latest Questions

Editorial Team
Asked: May 24, 2026 (2026-05-24T10:02:59+00:00)

I am working with a system that compresses large files (40 GB) and then


I am working with a system that compresses large files (40 GB) and then stores them in an archive.

Currently I am using libz.a (zlib) from C++ to compress the files, but to get any data out of a file I have to decompress the whole thing. Does anyone know of a compression component (preferably .NET compatible) that can store an index of original file positions and then, instead of decompressing the entire file, seek to just what is needed?

Example:

Original File       Compressed File
10 - 27         =>  2-5
100-202         =>  10-19
..............
10230-102020    =>  217-298

Since I know that the data I need occurs only between positions 10-27 of the original file, I’d like a way to map original file positions to compressed file positions.

Does anyone know of a compression library or similar readily available tool that can offer this functionality?


1 Answer

  1. Editorial Team
    Added an answer on May 24, 2026 at 10:03 am (2026-05-24T10:03:00+00:00)

    I’m not sure how much this will help, since the right solution depends on your requirements, but I had a similar problem on a project I am working on (at least I think so): I had to keep many text articles on disk and access them in a fairly random order, and because of the size of the data I had to compress them.

    The problem with compressing all the data at once is that most algorithms depend on previously seen data when decompressing. For example, the popular LZW method builds a dictionary (in effect, instructions for how to decompress the data) on the fly during decompression, so starting decompression in the middle of the stream is not possible, although I believe such methods can be tuned to allow it.
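To make that dependency concrete, here is a minimal sketch using Python's standard `zlib` module, which produces the same DEFLATE-based format as libz.a. The sample data and the helper name `try_decompress_from` are made up for illustration. Decompressing from the start works, while starting at an arbitrary byte offset inside the stream is expected to fail.

```python
import zlib

# Illustrative data: 1,000 numbered lines (not from the original post).
original = b"".join(b"line %d of the sample file\n" % i for i in range(1000))
compressed = zlib.compress(original)

# Decompressing from the beginning recovers the data.
assert zlib.decompress(compressed) == original


def try_decompress_from(offset):
    """Return True if decompression succeeds when started at `offset`."""
    try:
        zlib.decompress(compressed[offset:])
        return True
    except zlib.error:
        return False


# Hypothetical starting point: halfway through the compressed stream.
middle = len(compressed) // 2
started_in_middle = try_decompress_from(middle)
print("decompress from middle succeeded:", started_in_middle)
```

The decompressor needs both the stream header and the preceding data that later back-references point into, which is why a plain single-stream archive cannot be entered at an arbitrary position.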

    The solution I have found to work best, although it does reduce the compression ratio, is to pack the data in chunks. In my project it was simple: each article was one chunk. I compressed them one by one and created an index that recorded where each chunk starts. Decompression was then easy: look up the article I wanted and decompress just that one stream.

    So, my file looked like this:

    Index; compress(A1); compress(A2); compress(A3)

    instead of

    compress(A1;A2;A3).
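The `Index; compress(A1); compress(A2); compress(A3)` layout can be sketched as follows. This is only an illustration in Python with the standard `zlib` module, not the code from my project; the function names `pack` and `read_article` and the sample articles are hypothetical, and the index is kept in memory rather than written to the file.

```python
import zlib


def pack(articles):
    """Compress each article separately and return (index, blob).

    index[i] is the (offset, length) of article i's compressed bytes in blob.
    """
    index, parts, offset = [], [], 0
    for text in articles:
        comp = zlib.compress(text)
        index.append((offset, len(comp)))
        parts.append(comp)
        offset += len(comp)
    return index, b"".join(parts)


def read_article(index, blob, i):
    """Decompress only article i; the other articles are never touched."""
    offset, length = index[i]
    return zlib.decompress(blob[offset:offset + length])


# Made-up sample articles.
articles = [b"First article. " * 50, b"Second article. " * 50, b"Third article. " * 50]
index, blob = pack(articles)
second = read_article(index, blob, 1)
```

In a real archive the index would be serialized at the start (or end) of the file, and `blob[offset:offset + length]` would become a seek followed by a read of `length` bytes.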

    If you can’t split your data that neatly, you can split it into chunks artificially, for example 5 MB each. Then, to read the data from 7 MB to 13 MB, you decompress only the chunks covering 5-10 MB and 10-15 MB.
    Your index file would then look like:

    0     -> 0
    5MB   -> sizeof(compress 5MB)
    10MB  -> sizeof(compress 5MB) + sizeof(compress next 5MB)
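A minimal sketch of that fixed-size chunk index, again in Python with the standard `zlib` module. The names `compress_chunked` and `read_range` are hypothetical, and the chunk size is 5 KB instead of 5 MB so the example runs instantly; the logic is the same at any size.

```python
import zlib

CHUNK = 5 * 1024  # 5 KB here; stands in for the 5 MB chunks described above


def compress_chunked(data, chunk_size=CHUNK):
    """Return (offsets, blob).

    offsets[k] is where compressed chunk k starts in blob; one extra final
    entry marks the end of the last chunk.
    """
    offsets, parts, pos = [0], [], 0
    for start in range(0, len(data), chunk_size):
        comp = zlib.compress(data[start:start + chunk_size])
        parts.append(comp)
        pos += len(comp)
        offsets.append(pos)
    return offsets, b"".join(parts)


def read_range(offsets, blob, start, end, chunk_size=CHUNK):
    """Return original bytes [start, end), assuming start < end.

    Only the chunks that overlap the requested range are decompressed.
    """
    first = start // chunk_size
    last = (end - 1) // chunk_size
    pieces = [
        zlib.decompress(blob[offsets[k]:offsets[k + 1]])
        for k in range(first, last + 1)
    ]
    joined = b"".join(pieces)
    skip = start - first * chunk_size
    return joined[skip:skip + (end - start)]


# Made-up sample data: 2,000 fixed-width records.
data = b"".join(b"record %06d\n" % i for i in range(2000))
offsets, blob = compress_chunked(data)
piece = read_range(offsets, blob, 7 * 1024, 13 * 1024)
```

Reading 7 KB to 13 KB here touches only the second and third chunks, mirroring the 5-10 MB and 10-15 MB chunks in the example above.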
    

    The drawback of this solution is a slightly worse compression ratio: the smaller the chunks, the worse the compression.
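The effect of chunk size can be measured directly. The sketch below (Python, standard `zlib`; the sample text and the helper name `chunked_size` are made up) compresses the same data as one stream and as independent chunks of several sizes. The exact numbers depend heavily on the data, so treat it as a way to experiment rather than as a benchmark.

```python
import zlib


def chunked_size(data, chunk_size):
    """Total compressed size when data is compressed in independent chunks."""
    return sum(
        len(zlib.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    )


# Illustrative, fairly repetitive text (made up for this example).
data = b"".join(
    b"article %d: the quick brown fox jumps over the lazy dog\n" % i
    for i in range(5000)
)

whole = len(zlib.compress(data))
sizes = {size: chunked_size(data, size) for size in (256, 4096, 65536)}

for size, total in sizes.items():
    print(f"chunk {size:>6} bytes -> {total} bytes compressed (whole file: {whole})")
```

Each independent chunk pays for its own stream header and checksum and starts with no history to refer back to, which is where the loss comes from.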

    Also: having many chunks of data doesn’t mean you need many separate files on disk. Just pack them one after another in a single file and remember where each one starts.

    Also: http://dotnetzip.codeplex.com/ is a nice library for creating zip files, written in C#, that you can use for the compression. It worked well for me, and you can use its built-in support for storing many files in one zip archive to take care of splitting the data into chunks.
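The zip-as-chunk-container idea is not specific to DotNetZip. As a rough illustration, the sketch below uses Python's standard `zipfile` module instead (it is not DotNetZip's API), with made-up entry names such as `chunk_0002`. Each chunk becomes one archive entry, and the archive's central directory plays the role of the index.

```python
import io
import zipfile

# Illustrative chunk contents (not from the original post).
chunks = {f"chunk_{k:04d}": (b"data for chunk %d\n" % k) * 100 for k in range(4)}

# Write each chunk as a separate deflate-compressed entry in one archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, payload in chunks.items():
        archive.writestr(name, payload)

# Read back a single entry. The central directory locates it, so the other
# entries are not decompressed.
with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()
    third = archive.read("chunk_0002")
```

With a naming scheme that encodes the chunk number, mapping an original file position to an entry is just integer division by the chunk size.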
