I need to store a large amount of data in a database (MySQL). I would like to save disk space by compressing the text data before storing it.
I know there will be a performance hit for compressing/decompressing data, but I am going to cache the decompressed data on a CDN, and for the most part the data will not become stale for months or even years.
Can you please recommend some good compression/decompression techniques? I am also open to alternatives to compressing/decompressing the data.
If you want a pure MySQL solution, you could always try using the ARCHIVE storage type for your table. The documentation describes it as an insert-only, no-update type of engine meant specifically for what you describe: stashing away things that won’t change for years.
To do the same thing in a conventional engine would require using zlib on your data streams, but remember that compression performs very poorly on already compressed data such as most popular image types or video. You describe your data as mostly text, which usually compresses quite well.
Ruby has Zlib::Deflate and its counterpart Zlib::Inflate, which can compress and expand data on demand. You could write your own wrapper similar to the JSON one by implementing the encode and decode methods on your module.
One thing to consider is that you can probably store the compressed data on your CDN, so long as you can be sure your client supports gzip encoding. I don’t know of any major browsers that don’t, as asset compression has become quite standard, especially in the mobile space.
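
As a rough illustration of the wrapper idea, here is a minimal sketch using Ruby's standard zlib library. The module name CompressedText and the sample text are made up for this example, and how you hook encode and decode into your model layer depends on your framework. The last lines assume a Ruby version recent enough to provide Zlib.gzip, which produces the gzip framing a CDN would serve with Content-Encoding: gzip.

```ruby
require "zlib"

# Hypothetical wrapper module; the name is an assumption for illustration.
module CompressedText
  # Compress a string into zlib-format bytes, suitable for a BLOB column.
  def self.encode(text)
    Zlib::Deflate.deflate(text.to_s, Zlib::BEST_COMPRESSION)
  end

  # Expand zlib-format bytes back into a UTF-8 string.
  def self.decode(blob)
    return nil if blob.nil?
    # Inflate returns a binary string, so tag it as UTF-8 again.
    Zlib::Inflate.inflate(blob).force_encoding(Encoding::UTF_8)
  end
end

# Repetitive text compresses well; real savings depend on your data.
sample = "The quick brown fox jumps over the lazy dog. " * 200
packed = CompressedText.encode(sample)
restored = CompressedText.decode(packed)

puts "original: #{sample.bytesize} bytes, compressed: #{packed.bytesize} bytes"
puts "round trip ok: #{restored == sample}"

# For the CDN copy, gzip framing is what browsers expect
# (assumes a Ruby version that ships Zlib.gzip).
gzipped = Zlib.gzip(sample)
```

Store the output of encode in a binary column type (BLOB or VARBINARY) rather than TEXT, since the compressed bytes are not valid character data.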