I have a Clojure network application. Its basic structure is like this:
- Server has one LinkedBlockingQueue or ArrayBlockingQueue (I have tried both)
- Multiple threads accept network connections and offer work to the queue
- One thread takes from the queue in an infinite loop and works on each item taken
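For readers who want something concrete, the structure described above can be sketched roughly as follows. This is not the code from the linked repository: `start-worker`, `run-demo`, and the `handle` callback are hypothetical names, the four plain producer threads only stand in for the threads that accept network connections, and the latch exists purely so the demo can tell when everything has been processed.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue CountDownLatch TimeUnit))

(defn start-worker
  "Starts one daemon thread that takes items from q forever and calls
  handle on each one. handle is a hypothetical stand-in for the real work."
  [^LinkedBlockingQueue q handle]
  (doto (Thread. ^Runnable (fn []
                             (loop []
                               (handle (.take q)) ; blocks while the queue is empty
                               (recur))))
    (.setDaemon true)
    (.start)))

(defn run-demo
  "Four producer threads each offer per-producer items to one shared queue,
  and a single worker consumes them. Returns the number of items processed."
  [per-producer]
  (let [q         (LinkedBlockingQueue.)
        done      (CountDownLatch. (* 4 per-producer))
        processed (atom 0)]
    (start-worker q (fn [_item]
                      (swap! processed inc)
                      (.countDown done)))
    (let [producers (mapv (fn [p]
                            (doto (Thread. ^Runnable
                                           (fn []
                                             (dotimes [i per-producer]
                                               (.offer q [p i]))))
                              (.start)))
                          (range 4))]
      ;; Wait until every producer has finished offering.
      (doseq [^Thread t producers]
        (.join t)))
    ;; Give the worker up to five seconds to drain the queue.
    (.await done 5 TimeUnit/SECONDS)
    @processed))
```

Calling `(run-demo 100)` should queue and process 400 items. In this sketch the `take` call itself only blocks while the queue is empty, so any slowness would have to come from whatever `handle` does.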
And I have noticed a severe performance issue with the take call:
- Threads are offering to the queue at a very fast rate, and the queue takes them all very quickly
- The one worker thread takes from the queue at a very, very slow rate (more than 200 times slower than the speed of offer)
- CPU usage is very very low – so the worker is not busy at all
Without the queue, in a benchmark situation, the same workload is able to maximize CPU usage and finish at a satisfactory speed.
So what is the best queuing technique to use in this scenario?
Here’s my code (less than 100 lines):
https://github.com/HouzuoGuo/Aurinko/blob/master/src/Aurinko/core.clj
Edit, details of my observation:
- I benchmarked request processing speed: it runs at approximately 8,000 requests per second without using a queue.
- I made the server program print a debug message when it queues a request, and another message when it finishes processing a request.
- I made a simple client program to send approximately 1,000 requests per second to the server.
- The server queues all the requests in time, and the queue becomes many thousands of elements long.
- The worker (request processor) appears to be processing only about 150 requests per second, according to the debug messages.
Edit:
Thanks for everyone’s help. I have confirmed that blocking queue is not the thing causing the performance issue. Although I have not found the performance bottleneck in my application, but there has to be one somewhere.
Final edit:
Thank you everyone. The performance bottleneck was caused by network IO rather than the blocking queue.
You state: “CPU usage is very very low – so the worker is not busy at all”. You also say: “I have confirmed that blocking queue is not the thing causing the performance issue. Although I have not found the performance bottleneck in my application, but there has to be one somewhere.”
If both of those statements are true, it might be that your worker thread spends a lot of time waiting on I/O. If so, there is a simple solution: run more than one worker thread!
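A minimal sketch of the multi-worker idea, assuming the per-request work is dominated by blocking I/O. Here `Thread/sleep` merely imitates that wait, and `start-workers` and `timed-run` are hypothetical helper names, not anything from the asker's code.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue CountDownLatch TimeUnit))

(defn start-workers
  "Starts n daemon threads that all take from the same queue q and call
  handle on every item. Returns the vector of threads."
  [n ^LinkedBlockingQueue q handle]
  (mapv (fn [_]
          (doto (Thread. ^Runnable (fn []
                                     (loop []
                                       (handle (.take q))
                                       (recur))))
            (.setDaemon true)
            (.start)))
        (range n)))

(defn timed-run
  "Queues `items` fake requests, each of which sleeps io-ms milliseconds to
  imitate blocking I/O, and returns the elapsed milliseconds with n workers."
  [n items io-ms]
  (let [q      (LinkedBlockingQueue.)
        done   (CountDownLatch. items)
        handle (fn [_item]
                 (Thread/sleep (long io-ms)) ; simulated I/O wait (assumption)
                 (.countDown done))
        start  (System/nanoTime)]
    (start-workers n q handle)
    (dotimes [i items]
      (.offer q i))
    ;; Wait for all items to be handled, with a generous upper bound.
    (.await done 30 TimeUnit/SECONDS)
    (/ (- (System/nanoTime) start) 1e6)))
```

Comparing `(timed-run 1 40 20)` with `(timed-run 8 40 20)` should show the eight-worker run finishing several times faster, because the workers overlap their waits. This only helps if the real handler is safe to run concurrently and is actually waiting rather than computing.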
Or it may be that there is some other concurrency bottleneck (not the work queue).
Why don’t you do the following: make a little test program which pushes about 1,000 items on the work queue, and then starts running the same code which runs on the worker thread. When the queue is empty, it should exit. Profile that program. (Do you have a profiler set up on your dev machine? I like using JIP.)
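Such a test program might look roughly like the sketch below. `drain-and-time` is a hypothetical name, and `handle` should be replaced with the function the worker thread really runs. It uses `poll` rather than `take` so that the loop ends when the queue is empty instead of blocking forever.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue))

(defn drain-and-time
  "Pre-loads n items into a fresh queue, then processes them on the calling
  thread with handle until the queue is empty. Returns a map with the number
  of items processed and the elapsed time in milliseconds."
  [n handle]
  (let [q (LinkedBlockingQueue.)]
    (dotimes [i n]
      (.offer q i))
    (let [start      (System/nanoTime)
          processed  (loop [c 0]
                       ;; poll returns nil once the queue is empty
                       (if-some [item (.poll q)]
                         (do (handle item)
                             (recur (inc c)))
                         c))
          elapsed-ms (/ (- (System/nanoTime) start) 1e6)]
      {:processed  processed
       :elapsed-ms elapsed-ms})))
```

For example, `(drain-and-time 1000 identity)` measures the cost of the queue operations alone. Passing the real request handler instead and running the program under a profiler should show where the time goes.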