I have double A[B_ROWS][B_COLUMNS]; with the C API I used stuff like:
MPI_Isend(&A[low_bound][0], (upper_bound - low_bound) * A_COLUMNS, MPI_DOUBLE, i, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &request);
and
MPI_Recv(&A[low_bound][0], (upper_bound - low_bound) * A_COLUMNS, MPI_DOUBLE, 0, MASTER_TO_SLAVE_TAG + 2, MPI_COMM_WORLD, &status);
Now with boost::mpi I try:
world.isend(i, TO_SLAVE_TAG + 2, &A[low_bound][0], (upper_bound - low_bound) * A_COLUMNS);
and
world.recv(0, TO_SLAVE_TAG + 2, &A[low_bound][0], (upper_bound - low_bound) * A_COLUMNS);
but my app constantly fails with stuff like:
rank 1 in job 10 master_39934 caused collective abort of all ranks
exit status of rank 1: killed by signal 11
which means a segfault. Please note that the original C app worked as needed, and all I changed was the API in use, not any of the logic around it.
So what is the correct way of sending 2D C-style arrays over boost::mpi?
Assuming that my blind guess is right, and what you typed above is accurate, the size of A has nothing to do with A_COLUMNS (instead, A has B_COLUMNS). If so, code that deduces the count from the array type will fix that kind of “out of sync” error. For one- and two-dimensional arrays, it figures out how many copies of T you really want to send, instead of you having to maintain it manually.
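A minimal sketch of such count-deducing wrappers follows. The names isend_rows, recv_rows and element_count are hypothetical, not part of boost::mpi, and Comm is assumed to be boost::mpi::communicator (or any type with the same pointer-and-count isend/recv members). The row length N comes from the array type, so it cannot drift away from the declaration of A.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Number of elements in `row_count` rows of N elements each, checked
// against the `int` count that the pointer overloads of isend/recv take.
template <std::size_t N>
int element_count(std::size_t row_count) {
    const std::size_t max_int =
        static_cast<std::size_t>(std::numeric_limits<int>::max());
    assert(row_count <= max_int / N);
    return static_cast<int>(row_count * N);
}

// Hypothetical wrapper: send `row_count` rows starting at `rows`.
// Comm is assumed to behave like boost::mpi::communicator.
template <class Comm, class T, std::size_t N>
auto isend_rows(Comm& world, int dest, int tag,
                T (*rows)[N], std::size_t row_count) {
    return world.isend(dest, tag, &rows[0][0], element_count<N>(row_count));
}

// Hypothetical wrapper: receive `row_count` rows into `rows`.
template <class Comm, class T, std::size_t N>
auto recv_rows(Comm& world, int source, int tag,
               T (*rows)[N], std::size_t row_count) {
    return world.recv(source, tag, &rows[0][0], element_count<N>(row_count));
}
```

With the arrays from the question, a call would look like isend_rows(world, i, TO_SLAVE_TAG + 2, &A[low_bound], upper_bound - low_bound), and the element count becomes (upper_bound - low_bound) * B_COLUMNS without anyone typing the column constant.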
It even works for slices, like &A[low_bound], upper_bound - low_bound.
One thing you might want to be careful of is blowing past the end of your array. It is easily possible that your C code blew past the end of arrays, but there wasn’t anything important there, so it survived. In the C++ code, you could have an object there, and you die instead of survive.
Another approach might be to write a function that takes both upper and lower bounds for a slice. In this case, you pass the array directly, and then say where you want to start and end reading. The advantage is that the function checks that the array actually has the data to send, or room for the data to arrive.
(These two functions rely on the functions above)
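A rough sketch of the bounds-taking variant is below. The names isend_slice and recv_slice are hypothetical, and Comm is again assumed to be boost::mpi::communicator or something with the same isend/recv members. To keep the block self-contained, it computes the element count inline instead of calling other helpers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: send rows [low, high) of a 2D array.
// Rows and Cols are deduced from the array type, so the checks below
// always agree with how the array was actually declared.
template <class Comm, class T, std::size_t Rows, std::size_t Cols>
auto isend_slice(Comm& world, int dest, int tag,
                 T (&a)[Rows][Cols], std::size_t low, std::size_t high) {
    assert(low <= high);   // the slice is not reversed
    assert(high <= Rows);  // the array really has the data to send
    return world.isend(dest, tag, &a[low][0],
                       static_cast<int>((high - low) * Cols));
}

// Hypothetical helper: receive into rows [low, high) of a 2D array.
template <class Comm, class T, std::size_t Rows, std::size_t Cols>
auto recv_slice(Comm& world, int source, int tag,
                T (&a)[Rows][Cols], std::size_t low, std::size_t high) {
    assert(low <= high);   // the slice is not reversed
    assert(high <= Rows);  // the array really has room for the data
    return world.recv(source, tag, &a[low][0],
                      static_cast<int>((high - low) * Cols));
}
```

Under these assumptions, the sender would call isend_slice(world, i, TO_SLAVE_TAG + 2, A, low_bound, upper_bound) and the receiver recv_slice(world, 0, TO_SLAVE_TAG + 2, A, low_bound, upper_bound). Note that the checks use plain assert, so they disappear when NDEBUG is defined.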
In a distributed situation, you’d want to produce a logging mechanism for your Asserts that is descriptive.
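One possible shape for such a descriptive assert is sketched here. Both describe_failure and RANK_ASSERT are hypothetical names, and the rank argument is assumed to come from something like world.rank(). The point is only that each failure message names the process that produced it, since output from many ranks tends to be interleaved.

```cpp
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>

// Build a failure message that names the rank, the failed expression,
// and the source location.
inline std::string describe_failure(int rank, const char* expr,
                                    const char* file, int line) {
    std::ostringstream out;
    out << "[rank " << rank << "] assertion failed: " << expr
        << " (" << file << ":" << line << ")";
    return out.str();
}

// Hypothetical macro: log a rank-tagged message to stderr, then abort
// this process if the condition does not hold.
#define RANK_ASSERT(rank, cond)                                              \
    do {                                                                     \
        if (!(cond)) {                                                       \
            std::cerr << describe_failure((rank), #cond, __FILE__, __LINE__) \
                      << std::endl;                                          \
            std::abort();                                                    \
        }                                                                    \
    } while (0)
```

This sketch ends the failing process with std::abort; in a real MPI job you would probably want to call MPI_Abort instead, so that the remaining ranks are shut down rather than left waiting.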
Here is sample use of the above code:
and ditto for recv.