I’m using OpenCV’s cascade classifier for detection. My CPU utilization never goes above 50%, yet the application runs at only ~8 FPS, so there should be a lot of room for improvement. I’ve installed OpenCV with TBB. My own program doesn’t use any multithreading; the only threading is on OpenCV’s side (the detectMultiScale function). All CPU cores sit at around 40%. I’ve tried setting the program’s priority to realtime, but that didn’t help. Could there be a bottleneck of some sort that I am not aware of?
Build details:
I’m using the Visual Studio 2010 IDE with these optimization settings: Optimization: Maximize Speed (/O2), Inline Function Expansion: Default, Enable Intrinsic Functions: Yes (/Oi), Favor fast code (/Ot), Omit frame pointers: Yes (/Oy), Enable Fiber-Safe Optimizations: No, Whole Program Optimization: Yes (/GL). I’m on 64-bit Windows 7 and built the program in Release mode as a 64-bit binary.
Maybe you have an Intel processor with Hyper-Threading (2 hardware threads per core), and TBB is smart enough to use only one thread per core (which is usually better than using two). The operating system counts every logical processor, so it reports only half of the available power as being used.
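To make the arithmetic concrete, here is a small sketch of how an overall utilization figure comes out when only some logical processors are busy. The function name reportedUtilization is made up for illustration, and the numbers in the note below (8 logical processors, 4 busy threads at about 80% each) are assumptions that happen to match the figures mentioned in this thread, not measurements.

```cpp
#include <cassert>

// Overall CPU utilization in percent, computed the way a task manager
// typically averages it: busy time summed over the busy threads, divided
// by the number of logical processors the OS counts.
// Hypothetical helper, for illustration only.
double reportedUtilization(unsigned logicalCpus, unsigned busyThreads, double loadPerThread)
{
    assert(logicalCpus > 0 && busyThreads <= logicalCpus);
    return 100.0 * busyThreads * loadPerThread / logicalCpus;
}
```

With 8 logical processors and 4 fully busy threads, reportedUtilization(8, 4, 1.0) gives 50%, which would explain the ceiling in the question. With the same 4 threads at roughly 80% each, it gives about 40%.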
EDIT
If you want to modify the classifier yourself, you can call setNumThreads(4); and then map thread affinities. You will then get 100% per core instead of the average of 80%, as explained in the comments.
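A rough sketch of the affinity part, in case it helps. The helper names onePerCoreMasks and pinCurrentThread are made up, and the mask layout assumes that Hyper-Threading siblings are numbered adjacently (logical processors 0 and 1 share a core, 2 and 3 share the next one, and so on). That numbering is an assumption; the real topology should be checked on the target machine. Where each worker thread would call pinCurrentThread is left open, since that depends on how you modify the classifier.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Build one affinity mask per worker, each selecting a single logical processor.
// ASSUMPTION: sibling hardware threads have adjacent indices, so stepping by
// threadsPerCore picks one logical processor per physical core.
std::vector<std::uint64_t> onePerCoreMasks(unsigned workers, unsigned threadsPerCore)
{
    assert(threadsPerCore > 0 && workers * threadsPerCore <= 64);
    std::vector<std::uint64_t> masks;
    for (unsigned i = 0; i < workers; ++i)
        masks.push_back(std::uint64_t(1) << (i * threadsPerCore));
    return masks;
}

// Pin the calling thread to the given mask.
// Returns false if the call fails or the platform is not handled here.
bool pinCurrentThread(std::uint64_t mask)
{
#ifdef _WIN32
    return SetThreadAffinityMask(GetCurrentThread(),
                                 static_cast<DWORD_PTR>(mask)) != 0;
#else
    (void)mask;
    return false; // this sketch only covers Windows
#endif
}
```

For 4 workers on a CPU with 2 hardware threads per core, onePerCoreMasks(4, 2) yields the masks 0x1, 0x4, 0x10 and 0x40, i.e. logical processors 0, 2, 4 and 6.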
What you see is the difference between the marketing claim (8 cores!!) and the truth (~3 cores).