I’m writing a performance-critical .NET application which makes heavy use of multithreading.
Using the Visual Studio performance profiler, the top functions with Exclusive samples are:
WaitHandle.WaitAny() – 14.23%
@JIT_MonReliableEnter@8 – 7.76%
Monitor.Enter – 5.09%
Basically, my top 3 functions are threading primitives and, I believe, out of my control to some extent. My work/processing routines are pretty small in comparison and I’m trying to increase performance. I believe the algorithms involved are pretty sound, although I am reviewing them fairly frequently.
My questions are:
- If there are 14.23% of CPU samples in these methods – is the CPU effectively ‘idle’ for most of those samples, i.e. just waiting on other threads? Or is the idle part of the thread-waits not shown as a part of the profile trace [and the 27.08% shown in these 3 the sum of all overhead within those sync methods]? (I can guess that this is mostly idle, but would appreciate some decent reference material behind answers to this one please)
- I have reviewed my locking schemes, however do these results indicate some particular bottleneck or technique I should look into for further optimisation?
- Is WaitAny quite poor in particular? I use it heavily to check whether particular queue objects are readable/writable, while also checking an abort flag at the same time. Is there a better way to do that?
Your CPU isn’t necessarily idle when a thread is in a WaitHandle.WaitAny or a Monitor.Enter. A thread that’s in a wait is idle, but presumably other threads are busy executing. This is especially true of Monitor.Enter. If a thread is blocked on a lock, then one would hope the thread that has that lock is executing code rather than sitting idle.
Also, if your thread is using WaitAny to read from a queue, then it’s likely that the queue simply doesn’t have anything in it. That’s not a performance problem for the consumer code. It just means that the producer isn’t putting things into the queue fast enough. Now, that might be because the producer is slow, or because data isn’t coming in fast enough.
If you’re processing data faster than it can come in, then it doesn’t look like you have a performance problem. Certainly not on the consumer side.
As far as using WaitAny for queuing, I would suggest that you use BlockingCollection and the methods that take a cancellation token, like TryAdd(T, Int32, CancellationToken). Converting to cancellation tokens really simplified my multi-threaded queuing code.
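As a rough illustration of that suggestion, here is a minimal sketch of a bounded producer/consumer queue where the cancellation token takes the place of the abort flag. The class name QueueSketch, the helper Run, the capacity of 4 and the 100 ms timeout are all made up for the example and are not taken from the question's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class QueueSketch
{
    // Hypothetical helper: produces the integers 1..count, consumes them on
    // another thread, and returns the sum the consumer saw.
    public static int Run(int count, CancellationToken token)
    {
        // Bounded to 4 items (arbitrary), so the producer waits when the consumer falls behind.
        using (var queue = new BlockingCollection<int>(4))
        {
            int sum = 0;

            Task consumer = Task.Run(() =>
            {
                try
                {
                    // Blocks until an item is available, finishes once adding is
                    // completed and the queue is drained, and throws
                    // OperationCanceledException if the token is cancelled.
                    foreach (int item in queue.GetConsumingEnumerable(token))
                    {
                        sum += item;
                    }
                }
                catch (OperationCanceledException)
                {
                    // Abort requested: stop consuming.
                }
            });

            try
            {
                for (int i = 1; i <= count; i++)
                {
                    // TryAdd(T, Int32, CancellationToken): wait up to 100 ms for
                    // free space, retrying until the item is accepted.
                    while (!queue.TryAdd(i, 100, token))
                    {
                    }
                }
            }
            catch (OperationCanceledException)
            {
                // Abort requested: stop producing.
            }
            finally
            {
                // Lets the consumer's foreach end once the queue is empty.
                queue.CompleteAdding();
            }

            consumer.Wait();
            return sum;
        }
    }
}
```

Calling QueueSketch.Run(10, CancellationToken.None) returns 55. Passing a token from a CancellationTokenSource and calling Cancel() on it unblocks both sides, with no separate abort handle to include in a WaitAny call.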