I am trying to read a gz file from a web server through a Python script. The source file is 60 MB or more. I don't want to wait for the whole file to be read before decompressing it and reading the contents. Instead, I want to decompress the data as I receive it, a few bytes at a time.
I tried doing that, but I keep hitting errors like “CRC check failed”. I am using the gzip module, since the server returns the Content-Encoding as “gzip”. I have also tried zlib, but without success.
I have seen Mozilla Firefox and Google Chrome do this without any issues. I watched the HTTP headers and saw that the content is not received all at once, yet the browser is able to show decompressed partial data as it arrives. How do they do it? Any help is appreciated.
Use zlib.decompressobj with the wbits parameter set to 31. Then deobj.decompress(), where deobj is the object it returns, will allow you to decompress gzip input a piece at a time.

The 31 does not mean 31 bits. It is really 15 + 16, where 15 selects the maximum sliding window size of 2^15 bytes, and the 16 is an option to request gzip format decoding. Without adding 16, the zlib format will be decoded, which will reject gzip input.
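Here is a minimal sketch of that approach. The helper name iter_decompressed, the sample data, and the 16-byte chunk size are assumptions made for illustration. The network stream is simulated with an in-memory gzip buffer so the example runs on its own.

```python
import gzip
import zlib


def iter_decompressed(chunks):
    """Yield decompressed bytes for each compressed chunk as it arrives.

    `chunks` is any iterable of bytes objects (hypothetical helper, not
    part of zlib itself).
    """
    # 31 = 15 (window of 2^15 bytes) + 16 (decode gzip header and trailer).
    deobj = zlib.decompressobj(31)
    for chunk in chunks:
        data = deobj.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input ends.
    tail = deobj.flush()
    if tail:
        yield tail


# Simulate a download: gzip some sample data, then feed it 16 bytes at a time.
original = b"line of sample text\n" * 1000
compressed = gzip.compress(original)
pieces = [compressed[i:i + 16] for i in range(0, len(compressed), 16)]

restored = b"".join(iter_decompressed(pieces))
```

With a real download, the same helper should work on chunks read from the response, for example iter(lambda: response.read(8192), b"") on an object returned by urllib.request.urlopen. This assumes the HTTP client hands you the raw gzip bytes rather than decoding them itself.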