I have a machine with 8 processors. I want to alternate between OpenMP and MPI in my code like this:
OpenMP phase:
- ranks 1-7 wait on an MPI_Barrier
- rank 0 uses all 8 processors with OpenMP
MPI phase:
- rank 0 reaches the barrier and all ranks use one processor each
So far, I’ve done:
- set I_MPI_WAIT_MODE=1 so that ranks 1-7 don’t use the CPU while waiting on the barrier.
- called omp_set_num_threads(8) on rank 0 so that it launches 8 OpenMP threads.
That mostly worked: rank 0 did launch 8 threads, but all of them are confined to one processor. During the OpenMP phase, I get 8 threads from rank 0 running on one processor while all the other processors sit idle.
How do I tell MPI to allow rank 0 to use the other processors? I am using Intel MPI, but could switch to Open MPI or MPICH if needed.
Thanks all for the comments and answers. You are all right. It’s all about the “PIN” option.
To solve my problem, I just had to set:
I_MPI_WAIT_MODE=1
I_MPI_PIN_DOMAIN=omp
Simple as that. Now all processors are available to all ranks.
The option
I_MPI_DEBUG=4
shows which processors each rank gets.
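
For anyone who wants to cross-check the pinning from inside the program rather than from the MPI debug output, here is a minimal sketch in C. It assumes Linux, since it reads the Cpus_allowed_list field from /proc/self/status, and the helper name allowed_cpu_list is made up for illustration; it is not part of MPI or OpenMP.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the kernel's list of CPUs the calling
 * process may run on (for example "0-7" or "3") into buf.
 * Returns 0 on success, -1 on failure.
 * Assumption: Linux, where /proc/self/status has a Cpus_allowed_list field. */
int allowed_cpu_list(char *buf, size_t buflen)
{
    static const char key[] = "Cpus_allowed_list:";
    char line[4096];
    FILE *f;
    int found = -1;

    if (buf == NULL || buflen == 0)
        return -1;

    f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, sizeof key - 1) == 0) {
            const char *p = line + sizeof key - 1;
            /* Skip the whitespace after the colon. */
            while (*p == ' ' || *p == '\t')
                p++;
            snprintf(buf, buflen, "%s", p);
            /* Drop the trailing newline, if any. */
            buf[strcspn(buf, "\n")] = '\0';
            found = 0;
            break;
        }
    }

    fclose(f);
    return found;
}
```

Each rank could call this after MPI_Init and print the result next to its rank number. With the default pinning I would expect rank 0 to report a single CPU, and with the settings above the full range, but check that against what I_MPI_DEBUG=4 reports. Note that it reports the mask of the process's main thread, not of each OpenMP thread.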