I’m reading binary data from a file, specifically from a zip file. (To know more about the zip format structure see http://en.wikipedia.org/wiki/ZIP_%28file_format%29)
I’ve created a struct that stores the data:
typedef struct {
/*Start Size Description */
int signatute; /* 0 4 Local file header signature = 0x04034b50 */
short int version; /* 4 2 Version needed to extract (minimum) */
short int bit_flag; /* 6 2 General purpose bit flag */
short int compression_method; /* 8 2 Compression method */
short int time; /* 10 2 File last modification time */
short int date; /* 12 2 File last modification date */
int crc; /* 14 4 CRC-32 */
int compressed_size; /* 18 4 Compressed size */
int uncompressed_size; /* 22 4 Uncompressed size */
short int name_length; /* 26 2 File name length (n) */
short int extra_field_length; /* 28 2 Extra field length (m) */
char *name; /* 30 n File name */
char *extra_field; /*30+n m Extra field */
} ZIP_local_file_header;
The size returned by sizeof(ZIP_local_file_header) is 40, but if the sum of each field is calculated with sizeof operator the total size is 38.
If we have the next struct:
typedef struct {
short int x;
int y;
} FOO;
sizeof(FOO) returns 8 because the memory is allocated with 4 bytes every time. So, to allocate x are reserved 4 bytes (but the real size is 2 bytes). If we need another short int it will fill the resting 2 bytes of the previous allocation. But as we have an int it will be allocated plus 4 bytes and the empty 2 bytes are wasted.
To read data from file, we can use the function fread:
ZIP_local_file_header p;
fread(&p,sizeof(ZIP_local_file_header),1,file);
But as there’re empty bytes in the middle, it isn’t read correctly.
What can I do to sequentially and efficiently store data with ZIP_local_file_header wasting no bytes?
C
structs are just about grouping related pieces of data together, they do not specify a particular layout in memory. (Just as the width of anintisn’t defined either.) Little-endian/Big-endian is also not defined, and depends on the processor.Different compilers, the same compiler on different architectures or operating systems, etc., will all layout structs differently.
As the file format you want to read is defined in terms of which bytes go where, a struct, although it looks very convenient and tempting, isn’t the right solution. You need to treat the file as a
char[]and pull out the bytes you need and shift them in order to make numbers composed of multiple bytes, etc.