I have a dual-core processor, and according to the explanation below I should only be able to use 2 threads. In practice, though, I can launch more than 2 threads at the same time.
Here is a copy of the explanation:
The static hardware_concurrency() method, provided by the
boost::thread class, returns the number of threads that could
physically be executed at the same time based on the underlying number
of CPUs or CPU cores. Calling this function on a commonly used
dual-core machine, a value of 2 is returned. This allows for a simple
method to identify the theoretical maximum number of threads that
should be used simultaneously by a given multithreaded application.
hardware_concurrency() returns 2 in my case, but this program uses 4 threads at the same time:
#include <iostream>
#include <boost/thread.hpp>  // forward slash keeps the include portable

using namespace std;
using boost::thread;
using namespace boost::this_thread;
using boost::posix_time::seconds;

// Prints 0..9, sleeping 2 seconds between iterations.
void f1()
{
    for (int i = 0; i < 10; ++i)
    {
        cout << i << endl;
        sleep(seconds(2));
    }
}

// Same body as f1; kept as a separate function as in the original test.
void f2()
{
    for (int i = 0; i < 10; ++i)
    {
        cout << i << endl;
        sleep(seconds(2));
    }
}

int main()
{
    // 4 threads are executed on a dual-core machine (no problem)
    thread thr1(f1);
    thread thr2(f2);
    thread thr3(f1);
    thread thr4(f2);

    // Block until Enter is pressed so the threads have time to run.
    cin.ignore();
    return 0;
}
Can anyone explain this behavior?
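The term "threads" usually covers three abstraction layers:
1. User threads: threads launched by applications, mapped N:M onto
2. Kernel threads: threads managed by the operating system, mapped N:M onto
3. Hardware threads: the physical execution resources actually available.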
The 4 threads your application launches belong to category 1 (user threads), while the value 2 returned by hardware_concurrency() refers to category 3 (hardware threads). Because the mapping between layers is N:M, several user threads can share a smaller number of hardware threads; the scheduler simply time-slices them. In your program the threads also spend almost all of their time sleeping, so they rarely compete for a core at all.
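To make the N:M point concrete, here is a minimal sketch that starts an arbitrary number of threads and counts how many finish. It uses std::thread (C++11) instead of boost::thread so it has no external dependency; that swap, and the helper name run_sleeping_threads, are my own assumptions and not part of your code.

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper: launches `count` threads that each sleep briefly,
// then returns how many of them ran to completion.
unsigned run_sleeping_threads(unsigned count)
{
    std::atomic<unsigned> finished{0};
    std::vector<std::thread> threads;
    threads.reserve(count);

    for (unsigned i = 0; i < count; ++i) {
        threads.emplace_back([&finished] {
            // A sleeping thread is blocked and does not occupy a core.
            std::this_thread::sleep_for(std::chrono::milliseconds(50));
            ++finished;
        });
    }

    // Wait for every thread before reading the counter.
    for (std::thread& t : threads) {
        t.join();
    }
    return finished.load();
}
```

Calling run_sleeping_threads(8) on a dual-core machine should still report 8 completed threads, since the count of user threads is not capped by the number of hardware threads.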
That said, if the threads are doing intensive computation, starting more than about twice as many threads as there are hardware threads will typically hurt performance because of context switches and resource contention.
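One common way to act on this for CPU-bound work is to cap the number of worker threads at the hardware thread count. The following is only a sketch under assumptions: it uses std::thread::hardware_concurrency() as a stand-in for the Boost function, and pick_worker_count is a hypothetical helper name.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper: choose a worker count for CPU-bound work,
// never below 1 and never above the number of hardware threads.
unsigned pick_worker_count(unsigned requested)
{
    unsigned hw = std::thread::hardware_concurrency();
    if (hw == 0) {
        // The standard allows 0 when the value cannot be determined.
        hw = 1;
    }
    return std::max(1u, std::min(requested, hw));
}
```

On a dual-core machine that reports 2, pick_worker_count(4) would give 2. For threads that mostly sleep or wait on I/O, like the ones in the question, such a cap is usually unnecessary.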