I need a system that writes sequentially and very fast to a synchronized persistent queue, and reads very fast from it. The queue can spike and grow to hundreds of millions, possibly billions, of entries on certain days, and it could take days to catch up afterwards, which is fine.
I’m thinking of something like this:
Receiving multiple requests containing numbers 0-4 (order doesn’t matter)
Queue[10]: [ 0 1 2 3 4 _ _ _ _ _ ]
Each entry is also written to a file (f0) at the same time it is submitted to the queue, so that I don’t lose data if there is a failure.
While I’m reading from it in sequence (0, 1, 2) more numbers are posted:
Queue[10]: [ _ _ _ 3 4 5 6 7 _ _ ]
At this point my file f0 contains (0-7) and I have also persisted the last position read.
If I continue to write and the current queue gets full, the next 10 writes go directly into a file f1, the next 10 into f2, and so on. When the reader has finished reading all entries from the queue, f1 is loaded into the queue, reading continues, and f0 is deleted. When my reads catch up with the writes, the current file is read into the queue and processing continues from that point.
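To make the idea concrete, here is a rough Java sketch of the segment-file part of this scheme. It is only an illustration under several assumptions: entries are plain long values, the class name SegmentedQueue, its append/poll methods and the "position" file are made up for this example, and the in-memory queue is left out (the reader pulls straight from the segment files and relies on the OS file cache). It also never calls fsync, so it would not survive a power loss as written.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch: entries go to fixed-size segment files f0, f1, ... */
final class SegmentedQueue implements AutoCloseable {
    private final Path dir;
    private final int segmentSize;            // entries per segment file
    private final RandomAccessFile position;  // persists the last position read
    private long writeIndex;                  // total entries written so far
    private long readIndex;                   // total entries read so far
    private RandomAccessFile writer;          // open on the segment being filled
    private RandomAccessFile reader;          // open on the segment being drained

    SegmentedQueue(Path dir, int segmentSize) throws IOException {
        this.dir = dir;
        this.segmentSize = segmentSize;
        this.position = new RandomAccessFile(dir.resolve("position").toFile(), "rw");
        if (position.length() >= Long.BYTES) {
            readIndex = position.readLong();  // resume after a restart
        }
        // Recover the write position: find the newest segment and count its entries.
        long seg = readIndex / segmentSize;
        while (Files.exists(segment(seg + 1))) {
            seg++;
        }
        Path last = segment(seg);
        long inLast = Files.exists(last) ? Files.size(last) / Long.BYTES : 0;
        writeIndex = seg * segmentSize + inLast;
    }

    private Path segment(long n) {
        return dir.resolve("f" + n);
    }

    /** Appends one entry, rolling over to the next segment file when one fills up. */
    public synchronized void append(long value) throws IOException {
        if (writer == null) {
            writer = new RandomAccessFile(segment(writeIndex / segmentSize).toFile(), "rw");
            writer.seek((writeIndex % segmentSize) * Long.BYTES);
        }
        writer.writeLong(value);
        writeIndex++;
        if (writeIndex % segmentSize == 0) {  // segment is full
            writer.close();
            writer = null;
        }
    }

    /** Returns the next entry in write order, or null when reads have caught up. */
    public synchronized Long poll() throws IOException {
        if (readIndex == writeIndex) {
            return null;
        }
        if (reader == null) {
            reader = new RandomAccessFile(segment(readIndex / segmentSize).toFile(), "r");
            reader.seek((readIndex % segmentSize) * Long.BYTES);
        }
        long value = reader.readLong();
        readIndex++;
        position.seek(0);
        position.writeLong(readIndex);        // persist the last position read
        if (readIndex % segmentSize == 0) {   // segment fully consumed: delete it
            reader.close();
            reader = null;
            Files.delete(segment((readIndex - 1) / segmentSize));
        }
        return value;
    }

    @Override
    public synchronized void close() throws IOException {
        if (writer != null) writer.close();
        if (reader != null) reader.close();
        position.close();
    }
}
```

Because the read position is saved only after an entry has been read, a crash between the two steps would hand the same entry out again on restart, so whatever consumes the entries would need to tolerate duplicates.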
I can probably provide a better implementation by using a primary/secondary buffer.
However I would prefer to use an existing library if there is one that does what I need.
Any help would be greatly appreciated.
Sebi
I have a library that supports a persisted queue. It can sustain 5 to 20 million entries per second, and it can have any number of entries between the single producer and multiple consumers (they don’t even have to be running at the same time). It doesn’t carry any GC overhead.
https://github.com/peter-lawrey/Java-Chronicle
The library requires a 64-bit JVM if you want much scalability, and it is limited by the amount of disk space you have.
The library assumes you will cycle the files used as a maintenance task. This requires you to have sufficient disk space to cover the period between maintenance windows.