I have a large data set (200GB uncompressed, 9GB compressed in bz2 -9 )

Editorial Team

Asked: June 11, 2026 (2026-06-11T04:03:45+00:00)

I have a large data set (200GB uncompressed, 9GB compressed in bz2 -9 ) of stock tick data.

I want to run some basic time series analysis on them.

My machine has 16GB of RAM.

I would prefer to:

keep all data, compressed, in memory
decompress that data on the fly, and stream it [so nothing ever hits disk]
do all analysis in memory

Now, I think there are nice interactions here with Clojure’s laziness and future objects (i.e., I can define objects such that when I try to access them, they are decompressed on the fly).

Question: what are the things I should keep in mind when doing high performance time series analysis in Clojure?

I’m particularly interested in tricks involving:

efficiently storing tick data in memory
efficiently doing computation
weird convolutions to reduce # of passes over the data

Books / articles / research paper suggestions welcome. (I’m a CS PhD student).

Thanks.

1 Answer

Editorial Team · Answer 1 · 2026-06-11T04:03:47+00:00

Some ideas:

In terms of storing the compressed data, I don’t think you will be able to do much better than your OS’s own file system caching. Just make sure it’s configured to use 11GB+ of RAM for file system caching, and it should pull your whole compressed data set into memory as it is read the first time.
You should then be able to write your Clojure code to pull in the data lazily through a decompressing input stream, which will perform the decompression for you. Note that java.util.zip.ZipInputStream reads ZIP archives, not bzip2, and the JDK has no built-in bzip2 stream. For a .bz2 file you would need something like BZip2CompressorInputStream from Apache Commons Compress, or you could recompress the data as gzip and use GZIPInputStream.
If you need to perform a second pass on the data, just create a new input stream on the same file. OS-level caching should ensure that you don’t hit the disk again.
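
To make the streaming idea concrete, here is a minimal sketch under some assumptions. It uses GZIPInputStream as a stand-in because it ships with the JDK; for bz2 data a bzip2 stream from a third-party library would have to be swapped in at the marked line. The helper names (`gzip-bytes`, `with-ticks`, `parse-price`), the in-memory sample, and the "symbol,price" line format are all made up for illustration and are not part of the original question.

```clojure
(ns tick-stream-sketch
  (:require [clojure.java.io :as io])
  (:import (java.io ByteArrayInputStream ByteArrayOutputStream)
           (java.util.zip GZIPInputStream GZIPOutputStream)))

;; Hypothetical helper: gzip a string into a byte array so the sketch is
;; self-contained. In practice the bytes would come from the file on disk.
(defn gzip-bytes [^String s]
  (let [out (ByteArrayOutputStream.)]
    (with-open [gz (GZIPOutputStream. out)]
      (.write gz (.getBytes s "UTF-8")))
    (.toByteArray out)))

;; Hypothetical helper: open a fresh decompressing stream, pass a lazy seq of
;; lines to f, and close the stream when f returns. f must finish consuming
;; the seq before returning, because it is unusable once the reader is closed.
(defn with-ticks [open-stream f]
  ;; Swap GZIPInputStream for a bzip2 stream here when reading .bz2 data.
  (with-open [rdr (io/reader (GZIPInputStream. (open-stream)))]
    (f (line-seq rdr))))

;; Assumed line format "symbol,price" (invented for this example).
(defn parse-price [^String line]
  (Double/parseDouble (second (.split line ","))))

(def sample (gzip-bytes "AAA,10.0\nAAA,12.0\nAAA,11.0\n"))
(defn open-sample [] (ByteArrayInputStream. sample))

;; First pass: mean price. reduce walks the lazy seq without retaining lines.
(def mean-price
  (with-ticks open-sample
    (fn [lines]
      (let [[sum n] (reduce (fn [[s n] line]
                              [(+ s (parse-price line)) (inc n)])
                            [0.0 0]
                            lines)]
        (/ sum n)))))

;; Second pass: open a new stream over the same source and decompress again.
(def max-deviation
  (with-ticks open-sample
    (fn [lines]
      (reduce (fn [m line]
                (max m (Math/abs (double (- (parse-price line) mean-price)))))
              0.0
              lines))))
```

With a real file, `open-sample` would be replaced by a function that returns a new `(io/input-stream "…")` on the compressed file each time it is called. The important detail is that all consumption happens inside `with-ticks`; returning the lazy seq itself would fail once the reader is closed.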
