Assume I have a regular fixed-width file that is sorted on one of its fields. Since I know the record length, I can use lseek to implement a binary search that finds records whose field matches a given value without having to read the entire file.
Now the difficulty is that the file is gzipped. Is it possible to do this without completely inflating the file? If not with gzip, is there any compression that supports this kind of behavior?
This is not possible with a file compressed as a single gzip or zip stream. DEFLATE, the algorithm behind both, is based on a sliding dictionary window (LZ77, referring back to as much as 32 KB of previously decoded output), with Huffman coding of the resulting symbols on top of that. The bottom line is that a particular sequence of bytes in the middle of such a stream is meaningless without the context that precedes it, so you cannot seek to an arbitrary offset and start decoding there.
If you want to be able to randomly read a particular record out of a compressed file, you have to compress each record independently and then keep an index into the file. Depending on your data, this would probably render the compression step worthless, because a short record compressed on its own gives the compressor very little redundancy to exploit.
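As a rough illustration of the "compress independently, then index" idea, here is a minimal Python sketch. The function names (`build`, `read_block`, `find`) and the `records_per_block` parameter are made up for this example. With `records_per_block=1` it matches the scheme described above; grouping several records per block is a variation I am assuming here, one that trades a little extra decompression per lookup for a better compression ratio. For brevity the compressed data is held in memory; with a real file you would seek to each indexed offset instead.

```python
import zlib

def build(records, records_per_block=1):
    """Compress fixed-width records in independent blocks.

    Returns (blob, index), where index[i] is the (offset, length)
    of compressed block i inside blob.
    """
    blob = bytearray()
    index = []
    for i in range(0, len(records), records_per_block):
        chunk = zlib.compress(b"".join(records[i:i + records_per_block]))
        index.append((len(blob), len(chunk)))
        blob += chunk
    return bytes(blob), index

def read_block(blob, index, i, record_len):
    """Decompress only block i and split it back into records."""
    offset, length = index[i]
    raw = zlib.decompress(blob[offset:offset + length])
    return [raw[j:j + record_len] for j in range(0, len(raw), record_len)]

def find(blob, index, record_len, key, target):
    """Binary search over blocks, then scan the single candidate block."""
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        # If the last key in this block is too small, look in later blocks.
        if key(read_block(blob, index, mid, record_len)[-1]) < target:
            lo = mid + 1
        else:
            hi = mid
    if lo == len(index):
        return None  # target is larger than every key in the file
    for rec in read_block(blob, index, lo, record_len):
        if key(rec) == target:
            return rec
    return None

if __name__ == "__main__":
    # Hypothetical data: 12-byte records sorted on a 4-digit key prefix.
    records = [b"%04d-payload" % n for n in range(0, 200, 2)]
    blob, index = build(records, records_per_block=16)
    print(find(blob, index, 12, lambda r: r[:4], b"0042"))
```

Each lookup decompresses only about log2(number of blocks) blocks rather than the whole file, but the index has to be stored somewhere alongside the data, and a plain gzip tool will not be able to read the result.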