I have computer with 4 cores and OMP application with 2 weighty tasks.
int main()
{
#pragma omp parallel sections
{
#pragma omp section
WeightyTask1();
#pragma omp section
WeightyTask2();
}
return 0;
}
Each task has such weighty part:
#omp pragma parallel for
for (int i = 0; i < N; i++)
{
...
}
I compiled program with -fopenmp parameter, made export OMP_NUM_THREADS=4.
The problem is that only two cores are loaded. How I can use all cores in my tasks?
My initial reaction was: You have to declare more parallelism.
You have defined two tasks that can run in parallel. Any attempt by OpenMP to run it on more than two cores will slow you down (because of cache locality and possible false sharing).
Edit If the parallel for loops are of any significant volume (say, not under 8 iterations), and you are not seeing more than 2 cores used, look at
omp_set_nested()the
OMP_NESTED=TRUE|FALSEenvironment variable