I find myself writing a simple program to extract data from a bmp file. I just got started and I am at one of those WTF moments.
When I run the program and supply this image: http://www.hack4fun.org/h4f/sites/default/files/bindump/lena.bmp
I get the output:
type: 19778
size: 12
res1: 0
res2: 54
offset: 2621440
The actual image size is 786,486 bytes. Why is my code reporting 12 bytes?
The header format described at http://en.wikipedia.org/wiki/BMP_file_format matches my BMP_FILE_HEADER structure. So why is it being filled with the wrong information?
The image file doesn’t appear to be corrupt and other images are giving equally wrong outputs. What am I missing?

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        unsigned short type;
        unsigned int size;
        unsigned short res1;
        unsigned short res2;
        unsigned int offset;
    } BMP_FILE_HEADER;

    int main (int args, char ** argv) {
        char *file_name = argv[1];
        FILE *fp = fopen(file_name, "rb");
        BMP_FILE_HEADER file_header;
        fread(&file_header, sizeof(BMP_FILE_HEADER), 1, fp);
        if (file_header.type != 'MB') {
            printf("ERROR: not a .bmp");
            return 1;
        }
        printf("type: %i\nsize: %i\nres1: %i\nres2: %i\noffset: %i\n", file_header.type, file_header.size, file_header.res1, file_header.res2, file_header.offset);
        fclose(fp);
        return 0;
    }

There are two mistakes I could find in your code.
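First mistake: you have to pack the structure with 1-byte alignment so that every field sits exactly where the file format puts it. Otherwise the compiler is free to align the fields, for example on 4-byte boundaries. In your code, the compiler inserted 2 bytes of padding after the short type field so that size would start on a 4-byte boundary. The short therefore effectively occupied 4 bytes instead of 2, and every field after it was read from the wrong position. The trick is to use a compiler directive that packs the struct, such as #pragma pack(1) placed before the definition. With that in place the fields should line up with the file.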
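To make the padding effect concrete, here is a small sketch that feeds the same 16 bytes into a padded and a packed version of the struct. The bytes are reconstructed from the file size (786,486) and the output printed in the question, not read from the actual lena.bmp, and the names PADDED_HEADER, PACKED_HEADER, read_padded, read_packed and print_both are made up for this illustration. It assumes a little-endian machine with 2-byte short and 4-byte int, and a compiler that honours #pragma pack (GCC, Clang and MSVC do).

```c
#include <stdio.h>
#include <string.h>

/* The struct from the question: the compiler may insert padding. */
typedef struct {
    unsigned short type;
    unsigned int   size;
    unsigned short res1;
    unsigned short res2;
    unsigned int   offset;
} PADDED_HEADER;

/* Same fields packed to 1 byte (hypothetical name for this sketch). */
#pragma pack(push, 1)
typedef struct {
    unsigned short type;
    unsigned int   size;
    unsigned short res1;
    unsigned short res2;
    unsigned int   offset;
} PACKED_HEADER;
#pragma pack(pop)

/* Assumed file contents, reconstructed from the question's numbers. */
static const unsigned char first_bytes[16] = {
    0x42, 0x4D,             /* "BM" signature                   */
    0x36, 0x00, 0x0C, 0x00, /* file size 786486 = 0x000C0036    */
    0x00, 0x00,             /* reserved 1                       */
    0x00, 0x00,             /* reserved 2                       */
    0x36, 0x00, 0x00, 0x00, /* pixel data offset 54             */
    0x28, 0x00              /* first bytes of the next header   */
};

/* Copy as many leading bytes as fit into a struct of n bytes. */
static void copy_prefix(void *dst, size_t n) {
    memcpy(dst, first_bytes, n < sizeof first_bytes ? n : sizeof first_bytes);
}

PADDED_HEADER read_padded(void) {
    PADDED_HEADER h;
    memset(&h, 0, sizeof h);
    copy_prefix(&h, sizeof h);
    return h;
}

PACKED_HEADER read_packed(void) {
    PACKED_HEADER h;
    memset(&h, 0, sizeof h);
    copy_prefix(&h, sizeof h);
    return h;
}

/* Print both interpretations side by side. */
void print_both(void) {
    PADDED_HEADER a = read_padded();
    PACKED_HEADER b = read_packed();
    printf("padded (%u bytes): size=%u res2=%u offset=%u\n",
           (unsigned)sizeof a, a.size, (unsigned)a.res2, a.offset);
    printf("packed (%u bytes): size=%u res2=%u offset=%u\n",
           (unsigned)sizeof b, b.size, (unsigned)b.res2, b.offset);
}
```

Under those assumptions the padded read gives size=12, res2=54 and offset=2621440, which are the numbers from the question, while the packed read gives size=786486 and offset=54.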
The other mistake is in this check: if (file_header.type != 'MB').
You are comparing a short, which is 2 bytes, against the character constant 'MB'. Single quotes are meant to hold a single 1-byte character. A constant with more than one character has an implementation-defined value, and the compiler is probably giving you a warning about it.
To get around this, you can split the 2 bytes into two 1-byte characters, which are known (M and B), and put them together into a word. For example: ('M' << 8) | 'B'.
Here is what happens in that expression: 'M' (which is 0x4D in ASCII) shifted 8 bits to the left gives 0x4D00. Now you can add or OR the next character into the zeroed low byte: 0x4D00 | 0x42 = 0x4D42 (where 0x42 is 'B' in ASCII). That is 19778 in decimal, the value your program printed for type. Thinking like this, you could just write 0x4D42 in the comparison. Then your code should work.
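As a rough sketch of that check, the snippet below builds the signature value from the two characters and compares against it. BMP_MAGIC, is_bmp_type and type_from_bytes are names invented for this example. It assumes ASCII, and type_from_bytes is an optional extra that assembles the value from the first two file bytes by hand, so it does not depend on the byte order of the machine.

```c
/* "BM" as a 16-bit value: 'M' in the high byte, 'B' in the low byte. */
#define BMP_MAGIC ((unsigned short)(('M' << 8) | 'B')) /* 0x4D42 */

/* Returns 1 when the type field read from the header is "BM". */
int is_bmp_type(unsigned short type) {
    return type == BMP_MAGIC;
}

/* Combine the first two file bytes explicitly: the file stores 'B'
   first and 'M' second, so the second byte becomes the high byte. */
unsigned short type_from_bytes(const unsigned char bytes[2]) {
    return (unsigned short)((bytes[1] << 8) | bytes[0]);
}
```

In the question's program the test would then read something like if (!is_bmp_type(file_header.type)), which works on a little-endian machine once the struct is packed.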