A friend and I were debating whether the number of threads in a thread pool should be processor count + 1 or just processor count.
I chose just processor count because the threads can then be distributed evenly, one per processor, and he chose processor count + 1 because he thinks it’ll help him optimize performance.
Who is correct?
Neither of you.
The correct answer is: you need a number of threads that produces the best overall result.
Each thread you add has a cost (stack memory, plus scheduling and context-switching overhead). It’s a small cost but it’s definitely a cost. Put 30,000 threads in a thread pool and watch your system grind to a halt (if they’re all doing something).
But each thread notionally saves you some amount of (overall) time.
Conceptually you can plot that relationship and produce a range of threads that will give you the desired result given certain resource constraints. The correct number of threads is somewhere in that range.
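One way to get that plot is simply to measure it. Here is a minimal sketch in Python, assuming a made-up workload: `fake_task`, `time_pool` and `measure` are hypothetical names, and the task just sleeps to stand in for a network or disk wait. Swap in your real task, because the shape of the curve depends entirely on what the threads actually do.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_task(wait_seconds: float = 0.01) -> None:
    """Hypothetical task that mostly waits, standing in for a network or disk call."""
    time.sleep(wait_seconds)


def time_pool(num_threads: int, num_tasks: int = 32) -> float:
    """Return the wall-clock seconds needed to run num_tasks tasks on num_threads threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Consume the iterator so every task finishes before we stop the clock.
        list(pool.map(lambda _: fake_task(), range(num_tasks)))
    return time.perf_counter() - start


def measure(pool_sizes: list[int]) -> dict[int, float]:
    """Time the same workload once per candidate pool size."""
    return {n: time_pool(n) for n in pool_sizes}


if __name__ == "__main__":
    for n, seconds in measure([1, 2, 4, 8, 16, 32]).items():
        print(f"{n:>3} threads: {seconds:.3f}s")
```

With a sleep-based task the time keeps dropping well past the core count. A CPU-bound task would flatten out much earlier, which is exactly why you measure instead of guessing.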
Note: I said “overall result”. That’s not necessarily the “fastest result”. One thread may do a task in 10 minutes. 100 threads may do it in 15 seconds. 1000 threads may do it in 10 seconds because you’re starting to hit contention limits for your particular problem. Is that a better overall result? There might be no real difference between 10 and 15 seconds but 1000 threads may use way more memory.
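To make "best overall" concrete, here is a small sketch that picks the smallest pool size whose time is within some tolerance of the fastest one. `pick_pool_size` and the tolerance are hypothetical, and thread count is used as a rough stand-in for resource cost; the timings are the illustrative numbers from above, not measurements.

```python
def pick_pool_size(timings: dict[int, float], tolerance_seconds: float) -> int:
    """Return the smallest thread count whose time is within tolerance of the fastest."""
    best = min(timings.values())
    # Every pool size that is "fast enough" relative to the fastest run.
    acceptable = [n for n, seconds in timings.items() if seconds - best <= tolerance_seconds]
    # Among those, prefer the one that uses the fewest threads.
    return min(acceptable)


if __name__ == "__main__":
    # Illustrative numbers: 1 thread = 10 minutes, 100 = 15 s, 1000 = 10 s.
    timings = {1: 600.0, 100: 15.0, 1000: 10.0}
    print(pick_pool_size(timings, tolerance_seconds=5.0))
```

If 5 seconds doesn't matter to you, this settles on 100 threads rather than 1000. Set the tolerance to 0 and it picks 1000, the "fastest result".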
Remember that not all threads are CPU bound. In a lot of situations it makes perfect sense to have way more threads than cores, because at some point in performing a task a thread may sleep while it waits for something to happen (network communication, disk read, whatever).
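If you want a starting guess before measuring, a commonly cited rule of thumb (not something derived above, so treat it as an assumption) is threads ≈ cores × (1 + wait time / compute time). A sketch, where `suggested_pool_size` is a hypothetical helper and the wait and compute times are numbers you would have to estimate or profile yourself:

```python
import os
from typing import Optional


def suggested_pool_size(wait_time: float, compute_time: float,
                        cores: Optional[int] = None) -> int:
    """Rule-of-thumb starting point: cores * (1 + wait_time / compute_time)."""
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    if cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        cores = os.cpu_count() or 1
    return max(1, round(cores * (1 + wait_time / compute_time)))


if __name__ == "__main__":
    # Purely CPU-bound work (no waiting): roughly one thread per core.
    print(suggested_pool_size(wait_time=0.0, compute_time=1.0, cores=4))
    # Work that waits 9x longer than it computes: far more threads than cores.
    print(suggested_pool_size(wait_time=9.0, compute_time=1.0, cores=4))
```

Whatever number this gives you is only a first guess. You still want to measure around it, since contention and memory use don't show up in the formula.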