Note – may be more related to computer organization than software, not sure.
I’m trying to understand something related to data compression, say for jpeg photos. Essentially a very dense matrix is converted (via discrete cosine transforms) into a much more sparse matrix. Supposedly it is this sparse matrix that is stored. Take a look at this link:
http://en.wikipedia.org/wiki/JPEG
Comparing the original 8×8 sub-block image example to matrix “B”, which is transformed to have overall lower magnitude values and much more zeros throughout. How is matrix B stored such that it saves much more memory over the original matrix?
The original matrix clearly needs 8×8 (number of entries) x 8 bits/entry since values can range randomly from 0 to 255. OK, so I think it’s pretty clear we need 64 bytes of memory for this. Matrix B on the other hand, hmmm. Best case scenario I can think of is that values range from -26 to +5, so at most an entry (like -26) needs 6 bits (5 bits to form 26, 1 bit for sign I guess). So then you could store 8x8x6 bits = 48 bytes.
The other possibility I see is that the matrix is stored in a “zig zag” order from the top left. Then we can specify a start and an end address and just keep storing along the diagonals until we’re only left with zeros. Let’s say it’s a 32-bit machine; then 2 addresses (start + end) will constitute 8 bytes; for the other non-zero entries at 6 bits each, say, we have to go along almost all the top diagonals to store a sum of 28 elements. In total this scheme would take 29 bytes.
To summarize my question: if JPEG and other image encoders are claiming to save space by using algorithms to make the image matrix less dense, how is this extra space being realized in my hard disk?
Cheers
As it says in “Entropy coding” a zig-zag pattern is used, together with RLE which will already reduce size for many cases. However, as far as I know the DCT isn’t giving a sparse matrix per se. But it usually enhances the entropy of the matrix. This is the point where the compressen becomes lossy: The intput matrix is transferred with DCT, then the values are quantizised and then the huffman-encoding is used.