What I am trying to achieve in this simplified code is:
- 2 types of processes (root and children, with ids/ranks 10 and 0-9 respectively)
- init:
  - root will listen for the children's “completed” notifications
  - children will listen for root's notification that all have completed
- while there is no winner (not all done yet):
  - each child has a 20% chance of being done (and notifies root when it is)
  - root checks whether all are done
    - if all done: send the “winner” notification to the children
I have code like:
    int numprocs, id, arr[10], winner = -1;
    bool stop = false;
    MPI_Request reqs[10], winnerNotification;
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    for (int half = 0; half < 1; half++) {
        for (int round = 0; round < 1; round++) {
            if (id == 10) { // root
                // keeps track of who has "completed"
                fill_n(arr, 10, -1);
                for (int i = 0; i < 10; i++) {
                    MPI_Irecv(&arr[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD, &reqs[i]);
                }
            } else if (id < 10) { // children
                // listen to root of winner notification/indication to stop
                MPI_Irecv(&winner, 1, MPI_INT, 10, 1, MPI_COMM_WORLD, &winnerNotification);
            }
            while (winner == -1) {
                //cout << id << " is in loop" << endl;
                if (id < 10 && !stop && ((rand() % 10) + 1) < 3) {
                    // children has 20% chance to stop (finish work)
                    MPI_Send(&id, 1, MPI_INT, 10, 0, MPI_COMM_WORLD);
                    cout << id << " sending to root" << endl;
                    stop = true;
                } else if (id == 10) {
                    // root checks number of children completed
                    int numDone = 0;
                    for (int i = 0; i < 10; i++) {
                        if (arr[i] >= 0) {
                            //cout << "root knows that " << i << " has completed" << endl;
                            numDone++;
                        }
                    }
                    cout << "numDone = " << numDone << endl;
                    // if all done, send notification to players to stop
                    if (numDone == 10) {
                        winner = 1;
                        for (int i = 0; i < 10; i++) {
                            MPI_Send(&winner, 1, MPI_INT, i, 1, MPI_COMM_WORLD);
                        }
                        cout << "root sent notification of winner" << endl;
                    }
                }
            }
        }
    }
    MPI_Finalize();
The output from the debugging couts looks like this. The problem seems to be that root is not receiving all of the children's notifications that they have completed:
2 sending to root
3 sending to root
0 sending to root
4 sending to root
1 sending to root
8 sending to root
9 sending to root
numDone = 1
numDone = 1
... // many numDone = 1, but why 1 only?
7 sending to root
...
I thought perhaps I can’t receive into an array, but I tried:
    if (id == 1) {
        int x = 60;
        MPI_Send(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (id == 0) {
        MPI_Recv(&arr[1], 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cout << id << " received " << arr[1] << endl;
    }
Which works.
UPDATE
This seems to be resolved if I add an MPI_Barrier(MPI_COMM_WORLD) before the end of the while loop, but why? Even if the processes run out of sync, eventually the children will tell root that they have completed, and root should “listen” for that and process it accordingly? What seems to be happening is that root keeps running and hogs all the resources, so the children never get to execute at all? Or what's happening here?
UPDATE 2: some children not getting notification from root
OK, now that the problem of root not receiving the children’s completion notifications is resolved by @MichaelSh’s answer, I am focusing on the children not receiving from the parent. Here’s code that reproduces that problem:
    int numprocs, id, arr[10], winner = -1;
    bool stop = false;
    MPI_Request reqs[10], winnerNotification;
    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    srand(time(NULL) + id);
    if (id < 10) {
        MPI_Irecv(&winner, 1, MPI_INT, 10, 0, MPI_COMM_WORLD, &winnerNotification);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    while (winner == -1) {
        cout << id << " is in loop ..." << endl;
        if (id == 10) {
            if (((rand() % 10) + 1) < 2) {
                winner = 2;
                for (int i = 0; i < 10; i++) {
                    MPI_Send(&winner, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
                }
                cout << "winner notifications sent" << endl;
            }
        }
    }
    cout << id << " b4 MPI_Finalize. winner is " << winner << endl;
    MPI_Finalize();
Output looks like:
# 1 run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
9 b4 MPI_Finalize. winner is 2
0 b4 MPI_Finalize. winner is 2
# another run
winner notifications sent
10 b4 MPI_Finalize. winner is 2
8 b4 MPI_Finalize. winner is 2
Notice that some processes don't seem to get the notification from the parent. Why is that? Won't MPI_Wait in the child processes just hang them? So how do I resolve this?
Also
All MPI_Barrier does in your case is wait for the child responses to complete. Please check my answer for a better solution.
If I don’t do this, I suppose each child response will just take a few ms? So even if I don’t wait or use a barrier, I’d expect the receive to still happen soon after the send? Unless some processes end up hogging resources and other processes do not run?
Please try this block of code (error checking omitted for simplicity):
Edit: a better solution:
Each child posts a nonblocking receive for the root’s winner notification and then sends its own completion notification to the root.
Root posts nonblocking receives for the children’s notifications into the array, waits for all of them to arrive, and then sends the winner’s id to the children.
Insert this code below after
for (int round = 0; round < 1; round++)