I’m trying to write an MPI version of a program that runs an odd/even compare-split operation on n randomly generated elements.
Process 0 should generate the elements and send nlocal of them to each of the other processes (keeping the first nlocal for itself). From there, process 0 should run the CompareSplit algorithm and print out its results, then receive the results of the other processes’ runs of the algorithm, and finally print out the results that it has just received.
I have a large chunk of this already done, but I’m getting a deadlock that I can’t seem to fix. I would greatly appreciate any hints that people could give me.
Here is my code http://pastie.org/3742474
Right now I’m pretty sure that the deadlock is coming from the Send/Recv at lines 134 and 151. I’ve tried changing the Send to use “tag” instead of myrank for the tag parameter, but when I did that I just kept getting an “MPI_ERR_TAG: invalid tag” error for some reason.
Obviously I would also run the algorithm on the processes with rank > 0, but I took that part out for now, until I figure out what is going wrong.
Any help is appreciated.
EDIT: I’ve written a smaller test case that doesn’t contain any CompareSplit operations but still deadlocks. http://pastie.org/3744691
I fixed the above test case by changing the tag at line 83 from “myrank” to “tag”.
Well, the test case works, but when the actual algorithm is added back in, as in my program, it deadlocks.
So, I think I’ve narrowed the deadlock down to this chunk of code. It looks to be the Sendrecv under the else.
    for (i = 1; i <= npes; i++) {
        if (i % 2 == 1) // odd phase
            MPI_Sendrecv(elmnts, nlocal, MPI_INT, oddrank, 1, relmnts,
                         nlocal, MPI_INT, oddrank, 1, MPI_COMM_WORLD, &status);
        else            // even phase
            MPI_Sendrecv(elmnts, nlocal, MPI_INT, evenrank, 1, relmnts,
                         nlocal, MPI_INT, evenrank, 1, MPI_COMM_WORLD, &status);
        CompareSplit(nlocal, elmnts, relmnts, wspace,
                     myrank < status.MPI_SOURCE);
    }
The tag error occurred because tags have to be non-negative integers in the range from 0 to an implementation-dependent maximum (MPI_TAG_UB), which is guaranteed to be at least 32767.
The deadlock is pretty easy to understand; look at what the non-rank-zero processes are doing: one receive, and one send back. But process 0 is doing much more than this; it sends everyone their data, then executes a bunch of send-receives to process 1 (evenrank) and MPI_PROC_NULL (oddrank). The send-receives to oddrank are no-ops, and the send-receives to process 1 will never be answered because process 1 isn’t doing the same thing.
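To see why the pairing matters, here is a minimal sketch of how the partner for each phase is usually chosen. It assumes the common setup in which even ranks use rank - 1 as oddrank and rank + 1 as evenrank, and odd ranks the reverse; your pastie may compute them differently. The names phase_partner and NO_PARTNER are made up for illustration, and NO_PARTNER only stands in for MPI_PROC_NULL so the code compiles without MPI.

```c
#include <assert.h>

/* Hypothetical stand-in for MPI_PROC_NULL so this compiles without MPI. */
#define NO_PARTNER (-1)

/* Return the rank this process exchanges with in the given phase
   (phases numbered from 1, odd phases first), or NO_PARTNER if the
   process sits out that phase.
   Assumption: even ranks pair with rank - 1 in odd phases and with
   rank + 1 in even phases; odd ranks do the opposite. */
int phase_partner(int rank, int npes, int phase)
{
    int oddrank, evenrank, partner;

    assert(npes > 0 && rank >= 0 && rank < npes);

    if (rank % 2 == 0) {
        oddrank  = rank - 1;
        evenrank = rank + 1;
    } else {
        oddrank  = rank + 1;
        evenrank = rank - 1;
    }

    partner = (phase % 2 == 1) ? oddrank : evenrank;

    /* Ranks off either end of the range have nobody to talk to. */
    if (partner < 0 || partner >= npes)
        return NO_PARTNER;
    return partner;
}
```

With npes = 4, phase_partner(0, 4, 1) gives NO_PARTNER and phase_partner(0, 4, 2) gives 1, while phase_partner(1, 4, 2) gives 0. So rank 0's even-phase exchange can only complete if rank 1 is running the same loop and posting the matching send-receive.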
I think you need to move that part of the algorithm outside of the if (rank == 0) test.
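For anyone following along, CompareSplit itself isn't shown in the thread. Below is a minimal sketch of what such a routine typically does, assuming that both blocks are already sorted in ascending order, that wspace has room for nlocal ints, and that the last argument means "keep the smaller half". The body is my guess at the behaviour, not the code from the pastie; only the argument order is taken from the call in the question.

```c
#include <assert.h>
#include <string.h>

/* Sketch of a compare-split step (assumed behaviour, not the original code).
   elmnts  : local block, sorted ascending, overwritten with the result
   relmnts : block received from the partner, sorted ascending
   wspace  : scratch space for nlocal ints
   keepsmall != 0 : keep the nlocal smallest of the 2*nlocal values
   keepsmall == 0 : keep the nlocal largest */
void CompareSplit(int nlocal, int *elmnts, const int *relmnts,
                  int *wspace, int keepsmall)
{
    int i, j, k;

    assert(nlocal >= 0);

    /* Work from a copy of the local block, since elmnts is overwritten. */
    memcpy(wspace, elmnts, (size_t)nlocal * sizeof(int));

    if (keepsmall) {
        /* Merge from the front, taking the nlocal smallest values. */
        for (i = j = k = 0; k < nlocal; k++) {
            if (j == nlocal || (i < nlocal && wspace[i] < relmnts[j]))
                elmnts[k] = wspace[i++];
            else
                elmnts[k] = relmnts[j++];
        }
    } else {
        /* Merge from the back, taking the nlocal largest values. */
        for (i = j = k = nlocal - 1; k >= 0; k--) {
            if (j < 0 || (i >= 0 && wspace[i] >= relmnts[j]))
                elmnts[k] = wspace[i--];
            else
                elmnts[k] = relmnts[j--];
        }
    }
}
```

Both partners call this on the same pair of blocks, one keeping the smaller half and the other the larger half, which is why each exchange needs a process on the other side doing the matching send-receive.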