The Archive Base


The Archive Base Latest Questions

Editorial Team
Asked: May 25, 2026 (2026-05-25T15:38:15+00:00)

I am testing my new image file format, which without going into unnecessary detail


I am testing my new image file format, which, without going into unnecessary detail, consists of the PPM RGB 24-bit-per-pixel format sent through zlib's compression stream, with an 8-byte header prepended.

While I was writing up tests to evaluate the performance of the corresponding code which implements this I had one test case which produced pretty terrible results.

unsigned char *image = new unsigned char[3000*3000*3];
for(int i=0;i<3000*3000;++i) {
    image[i*3] = i%255;
    image[i*3+1] = (i/2)%255;
    image[i*3+2] = (i*i*i)%255;
}

Now what I’m doing here is creating a 3000×3000 fully packed 3 byte per pixel image, which has red and green stripes increasing steadily, but the blue component is going to be varying quite a bit.

When I compressed this using the zlib stream for my .ppmz format, it was able to reduce the size from 27,000,049 bytes (the reason it is not an even 27 million is 49 bytes are in the headers) to 25,545,520 bytes. This compressed file is 94.6% the original size.

This got me rather flustered at first because I figured that even if the blue component was so chaotic it couldn’t be helped much, at least the red and green components repeated themselves quite a bit. A smart enough compressor ought to be able to shrink to about 1/3 the size…

To test that, I took the original 27MB uncompressed file and RAR’d it, and it came out to 8,535,878 bytes. This is quite good, at 31.6%, even better than one-third!

Then I realized I had made a mistake defining my test image. I was using mod 255 when I should have been wrapping values into the range 0 to 255, which is mod 256:

unsigned char *image = new unsigned char[3000*3000*3];
for(int i=0;i<3000*3000;++i) {
    image[i*3] = i%256;
    image[i*3+1] = (i/2)%256;
    image[i*3+2] = (i*i*i)%256;
}

The only difference is that my pixels can now take one more value, which I was skipping previously. But when I ran my code again, the ppmz became a measly 145,797-byte file. WinRAR squeezed it into 62K.

Why would this tiny change account for this massive difference? Even mighty WinRAR couldn't get the original file under 8MB. What is it about repeating values every 256 steps that changes completely when they repeat every 255 steps? I get that with the %255 the first two color components' patterns are slightly out of phase, but the behavior is hardly random. And then there's just crazy modular arithmetic being dumped into the last channel. But I don't see how it could account for such a huge gap in performance.

I wonder if this is more of a math question than a programming question, but I really don’t see how the original data could contain any more entropy than my newly modified data. I think the power of 2 dependence indicates something related to the algorithms.

Update: I’ve done another test: I switched the third line back to (i*i*i)%255 but left the others at %256. ppmz compression ratio rose a tiny bit to 94.65% and RAR yielded a 30.9% ratio. So it appears as though they can handle the linearly increasing sequences just fine, even when they are out of sync, but there is something quite strange going on where arithmetic mod 2^8 is a hell of a lot more friendly to our compression algorithms than other values.


1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 3:38 pm

    Mystical has a big part of the answer, but it also pays to look at the mathematical properties of the data itself, especially the blue channel.

    (i * i * i) % 255 repeats with a period of 255, taking on 255 distinct values all equally often. A naïve coder (ignoring the pattern between different pixels, or between the R and B pixels) would need 7.99 bits/pixel to code the blue channel.
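The period-255 claim above holds for exact integer arithmetic. One thing worth checking separately is that the loop in the question computes i*i*i in an int: assuming a 32-bit int (the question does not state the platform), the product no longer fits once i reaches 1291, so the bytes actually written need not follow the exact residues. The sketch below is an illustration only. It models the overflow as unsigned wraparound mod 2^32, which is an assumption, since signed overflow is formally undefined behavior in C++. The function names are hypothetical.

```cpp
#include <cstdint>

// Exact i^3 mod m: reduce after every multiply so nothing overflows.
std::uint32_t cube_mod_exact(std::uint32_t i, std::uint32_t m) {
    std::uint64_t r = i % m;
    return static_cast<std::uint32_t>((r * r % m) * r % m);
}

// i^3 truncated to 32 bits after each multiply (models wraparound mod 2^32),
// then reduced mod m. This is an assumed model of the overflow, not a
// statement about what any particular compiler emitted.
std::uint32_t cube_mod_wrapped32(std::uint32_t i, std::uint32_t m) {
    const std::uint64_t mask = 0xFFFFFFFFull;
    std::uint64_t sq = (static_cast<std::uint64_t>(i) * i) & mask;  // i*i mod 2^32
    std::uint64_t cu = (sq * i) & mask;                             // i*i*i mod 2^32
    return static_cast<std::uint32_t>(cu % m);
}

// First i in [0, limit) where the wrapped result differs from the exact
// residue, or -1 if the two agree everywhere in that range.
std::int64_t first_mismatch(std::uint32_t m, std::uint32_t limit) {
    for (std::uint32_t i = 0; i < limit; ++i) {
        if (cube_mod_exact(i, m) != cube_mod_wrapped32(i, m)) {
            return static_cast<std::int64_t>(i);
        }
    }
    return -1;
}
```

Calling first_mismatch(256, 9000000) finds no disagreement, because 256 divides 2^32 and truncation therefore cannot change the residue. For m = 255 the two computations diverge as soon as the cube exceeds 32 bits, so under this model the %255 blue channel is not the clean period-255 sequence.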

    (i * i * i) % 256 is 0 whenever i is a multiple of 8 (8 cubed is 512, which is of course 0 mod 256).
    It's 64 whenever i is 4 more than a multiple of 16 (for example, 4 cubed is 64).
    It's 192 whenever i is 4 less than a multiple of 16 (for example, 12 cubed is 1728, which is 192 mod 256). Together with the multiples of 8, these cover all multiples of 4.
    It's one of 16 different values whenever i is an even non-multiple of 4, depending on i's residue mod 64.
    It takes on one of 128 distinct values whenever i is odd.

    This makes for only 147 different possibilities for the blue pixel, with some occurring much more often than others, and a naïve entropy of 6.375 bits/pixel for the blue channel.
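The counts above can be checked numerically. The following is a minimal sketch, not part of the original answer: it tallies the exact value of i^3 mod m over one full period and reports the number of distinct values and the order-0 Shannon entropy. The struct and function names are hypothetical, and the cube is reduced after each multiply so the integer overflow present in the question's loop does not enter into it.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct ChannelStats {
    int distinct;    // number of residues that occur at least once
    double entropy;  // order-0 Shannon entropy, in bits per sample
};

// Tally exact i^3 mod m for i = 0 .. span-1 and summarise the histogram.
ChannelStats cube_stats(std::uint32_t m, std::uint32_t span) {
    std::vector<std::uint64_t> hist(m, 0);
    for (std::uint32_t i = 0; i < span; ++i) {
        std::uint64_t r = i % m;
        ++hist[(r * r % m) * r % m];  // reduce after each multiply
    }
    ChannelStats s{0, 0.0};
    for (std::uint64_t n : hist) {
        if (n == 0) continue;  // unused residues contribute nothing
        ++s.distinct;
        double p = static_cast<double>(n) / span;
        s.entropy -= p * std::log2(p);
    }
    return s;
}
```

With cube_stats(256, 256) the tally gives 147 distinct values and 6.375 bits, matching the breakdown 1/8·3 + 2·(1/16)·4 + 16·(1/64)·6 + 128·(1/256)·8. With cube_stats(255, 255) every one of the 255 residues appears exactly once, so the entropy is log2(255), about 7.99 bits.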
