I am reading log files, but not every line can be processed straight away, so I use a queue (buffer) to hold lines while they wait to be processed.
This queue is regularly scanned for particular lines – when they are found, they are removed from the queue (they can be anywhere in it). When there isn’t a particular line to be found, lines are taken out of the start of the queue one by one to be processed.
Therefore, the queue needs to:
- Be resizable (or give that impression)
- Allow elements to be removed from anywhere
- Allow elements to be added (always at the end of the queue)
- Be quick to scan
- Depending on performance, keep a pointer to where the last scan stopped.
I initially wrote the code when I had little experience of Java or the API, and just used an ArrayList because I knew it would work (not necessarily because it was the best option).
Its performance is now becoming poor with more and more logs needing to be processed – so, what collection would you recommend to be used in this situation? There’s always the possibility of writing my own too.
Thanks
LinkedHashSet might be of interest. It is effectively a HashSet that also maintains a doubly-linked list running through its entries, which gives it a predictable (insertion) iteration order. It can therefore also be used as a FIFO queue, with the nice added benefit that it can’t contain duplicate entries.
Because it is a HashSet too, searches (as opposed to scans) can be O(1) if they can match on equals() (with a consistent hashCode()).
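As a rough illustration of the idea, here is a minimal sketch of a buffer built on LinkedHashSet. The class name PendingLines and its method names are made up for this example, and it assumes the lines are plain Strings. Note that because it is a set, a line equal to one already queued is not stored a second time, which may or may not be what you want for log lines.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.function.Predicate;

// Hypothetical buffer of pending log lines; all names here are illustrative.
class PendingLines {
    // Iterates in insertion order, so the first element is the oldest line.
    private final LinkedHashSet<String> lines = new LinkedHashSet<>();

    // Appends a line at the tail; returns false if an equal line is already queued.
    boolean add(String line) {
        return lines.add(line);
    }

    // Removes a known line via hashCode()/equals(), without scanning the queue.
    boolean removeExact(String line) {
        return lines.remove(line);
    }

    // Scans from the oldest line; removes and returns the first match, or null.
    String removeFirstMatching(Predicate<String> wanted) {
        Iterator<String> it = lines.iterator();
        while (it.hasNext()) {
            String line = it.next();
            if (wanted.test(line)) {
                it.remove(); // safe removal during iteration
                return line;
            }
        }
        return null;
    }

    // Removes and returns the oldest line, or null if the buffer is empty.
    String poll() {
        Iterator<String> it = lines.iterator();
        if (!it.hasNext()) {
            return null;
        }
        String head = it.next();
        it.remove();
        return head;
    }

    int size() {
        return lines.size();
    }
}
```

If the "particular lines" can only be recognised by a pattern rather than by an exact match, removeFirstMatching is still a linear scan. The gain over an ArrayList in that case is that removing the matched element does not shift every element after it.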