I am trying to write a program (probably in Java) to join a number of JPEGs together losslessly, without decoding them first.
I thought I’d start simple and try to append two JPEGs of the same size, compressed with the same settings, one above the other using a hex editor.
First I extract the image data of JPEG B and append it to JPEG A. By modifying the dimensions specified in the headers, I get a new, recognizable picture (JPEG A with JPEG B appended along the y-axis) which can be displayed. However, although the image data from JPEG B is clearly recognizable, it seems to have lost a lot of colour information and is clearly incorrect.
So my question is: what steps am I missing here? I don’t think there are any other dimension-specific header values I need to change, so maybe I need to Huffman-decode the image data from both JPEGs, append them together, and then re-encode the lot?
I’ve spent some time reading up on the JPEG spec and headers, but to be honest I’m out of my depth and could really do with a pointer or two!
Thanks a lot for any help.
Thanks for all the suggestions. Yes, this is definitely possible; I should have mentioned jpegtran in my original question. I am basically trying to replicate this aspect of jpegtran’s functionality in my own program. I guess I should look at the jpegtran source, but I know nothing about C and not very much about programming in general, so reverse engineering source code is easier said than done!
OK, I worked out where I was going wrong.
1) The image scan data is saved in bytes, but the actual important info is encoded as variable-length bit strings. This means that the end of the actual image data does not necessarily fall on a byte boundary. When the JPEG encoder needs to pad out the number of bits to reach the byte boundary, it simply appends a series of 1 bits.
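The padding step can be sketched like this (a minimal illustration; the class and method names are my own, and the example byte values are made up):

```java
// Hypothetical sketch: padding a partially filled final byte with 1 bits,
// as a JPEG encoder does at the end of the entropy-coded scan data.
public class BitPadding {
    /**
     * Pads the low (8 - bitCount) bits of a partially filled byte with 1s.
     * 'bits' holds bitCount valid data bits in its most significant positions.
     */
    static int padWithOnes(int bits, int bitCount) {
        int padLength = (8 - bitCount) % 8;       // bits needed to reach a byte boundary
        return bits | ((1 << padLength) - 1);     // fill the remaining low bits with 1s
    }

    public static void main(String[] args) {
        // Suppose the last 5 data bits are 10110, so the byte so far is 1011_0000.
        int partial = 0b10110000;
        int padded = padWithOnes(partial, 5);
        System.out.println(Integer.toBinaryString(padded)); // 10110111
    }
}
```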
2) The way the actual pixel info is stored is a little too complicated (at least for me) to explain fully, but basically everything is encoded in MCUs (Minimum Coded Units). These vary in size depending on the chroma subsampling, with horizontal and vertical sizes of either 8 or 16 pixels. Each MCU contains DC and AC coefficients for each component: luminance (Y) and chrominance (Cb and Cr). The problem was that the DC coefficients are stored as differences from the corresponding DC value of the previous MCU. So when I added the new image data from JPEG B, its DC values had been stored relative to 0 (because there were no previous MCUs), but they needed to take into account the final DC values of the last MCU from JPEG A. (Hope that makes sense.)
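The arithmetic for the DC fix-up can be sketched as follows (hypothetical names and example values; this assumes you have already Huffman-decoded the DC differences):

```java
// Hypothetical sketch of the DC fix-up. JPEG stores each DC coefficient as a
// difference from the previous block's DC of the same component, with the
// predictor reset to 0 at the start of a scan.
public class DcAdjust {
    /**
     * Returns the corrected first DC difference for one component of image B.
     * firstDiffB was encoded against a predictor of 0; after appending, its
     * predictor becomes the last absolute DC value of the same component in A.
     */
    static int adjustFirstDcDiff(int firstDiffB, int lastDcA) {
        int absoluteDcB = 0 + firstDiffB;   // B's first absolute DC (old predictor was 0)
        return absoluteDcB - lastDcA;       // new difference relative to A's last DC
    }

    public static void main(String[] args) {
        // Example: B's first luma DC was encoded as +12 (so its absolute value
        // is 12), and A's last luma DC was 50. The re-encoded difference is -38.
        System.out.println(adjustFirstDcDiff(12, 50)); // -38
    }
}
```

The same adjustment has to be applied per component (Y, Cb, and Cr each keep their own DC predictor).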
The solution:
You need to do an initial decode (Huffman + run-length) of the image data to find out exactly where the image data ends, and then strip the trailing 1s. You also need to change the initial DC values in the second JPEG appropriately. You then re-encode the appropriate bits, append 1s to pad to a byte boundary, et voilà.
If you want to append along the x-axis, it’s a little more complicated: you have to rearrange the MCUs so that they scan in the right order (JPEGs scan from left to right, then top to bottom), and then adjust the DC values appropriately.
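The reordering itself amounts to interleaving MCU rows. A minimal sketch (my own names; it assumes both images are the same height in MCUs and that each MCU has already been isolated as its own element, which in reality requires the Huffman decode described above, since the scan data is a bitstream):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the MCU reordering needed for a side-by-side join.
public class McuReorder {
    /** Interleaves MCU rows: for each row, all of A's MCUs, then all of B's. */
    static <T> List<T> sideBySide(List<T> a, List<T> b, int widthA, int widthB) {
        int rows = a.size() / widthA;
        List<T> out = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            out.addAll(a.subList(r * widthA, (r + 1) * widthA)); // row r of A
            out.addAll(b.subList(r * widthB, (r + 1) * widthB)); // row r of B
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = List.of("A0", "A1", "A2", "A3"); // 2x2 grid of MCUs
        List<String> b = List.of("B0", "B1", "B2", "B3"); // 2x2 grid of MCUs
        System.out.println(sideBySide(a, b, 2, 2));
        // [A0, A1, B0, B1, A2, A3, B2, B3]
    }
}
```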
So far I’ve only tested this on single-MCU JPEGs, but theoretically it should work with bigger ones too.
BTW, I only worked this out thanks to the owner of this excellent JPEG-related resource/blog.