I have a school project to do matrix multiplication on an HPC distributed system.
I need to read in a matrix from a parallel I/O system and use PBLAS to perform the matrix multiplication in parallel on many compute nodes (processors). The data must be read in using MPI-IO commands. I know PBLAS uses block-cyclic distributions to perform the multiplication.
The professor has not given us much info on MPI-IO, and I am having trouble finding information or resources on it. Specifically, are there ways to read in a matrix from a parallel I/O system in a block-cyclic manner and easily plug that into the PBLAS pdgemm routine?
Any pointers to useful resources would be much appreciated. I am a bit short on time, and getting frustrated with the lack of direction on this project.
This is actually relatively straightforward to do (if you already know something about BLACS/ScaLAPACK and MPI-IO!), but even then the documentation, even online, is somewhat poor, as you've discovered.
The first thing to know about MPI-IO is that it lets you use normal MPI data types to specify each process’ “view” of the file, and then read only the data that falls into that view. At our centre we have slides and source code for a half-day course on parallel IO; the first third or so is about the basics of MPI-IO. There are slides here and sample source code here.
The second thing to know is that MPI has a built-in way to create "distributed array" data types, one combination of which lets you lay out a block-cyclic data distribution; that's discussed in general terms in my answer to this question: "What is the difference between darray and subarray in mpi?"
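To make "block-cyclic" concrete, here is a minimal sketch in C of the index arithmetic along one dimension. The function names (owner, local_index, local_count) are made up for this illustration and are not part of MPI or ScaLAPACK; indices are 0-based, and the first block is assumed to live on process 0.

```c
/* Block-cyclic layout along ONE dimension: n indices are cut into blocks
   of nb, and the blocks are dealt out to nprocs processes round-robin.
   All names here are hypothetical helpers for illustration only. */

/* Process coordinate that owns global index g. */
int owner(int g, int nb, int nprocs)
{
    return (g / nb) % nprocs;
}

/* Position of global index g inside its owner's local storage. */
int local_index(int g, int nb, int nprocs)
{
    return (g / (nb * nprocs)) * nb + g % nb;
}

/* How many of the n global indices end up on process p.
   With the first block on process 0, this is the quantity that
   ScaLAPACK's NUMROC computes. */
int local_count(int n, int nb, int p, int nprocs)
{
    int nblocks = n / nb;                 /* number of full blocks */
    int count = (nblocks / nprocs) * nb;  /* full rounds of dealing */
    int extra = nblocks % nprocs;         /* leftover full blocks */

    if (p < extra)
        count += nb;                      /* gets one more full block */
    else if (p == extra)
        count += n % nb;                  /* gets the trailing partial block */
    return count;
}
```

For a 2D matrix you apply the same arithmetic independently to rows (with the number of process rows) and to columns (with the number of process columns). For example, with n = 7, nb = 2 and 2 processes, process 0 holds indices 0, 1, 4, 5 and process 1 holds 2, 3, 6.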
So that means if you have a binary file containing a big matrix, you can read it in with MPI-IO using MPI_Type_create_darray and it'll be distributed across tasks in a block-cyclic way. Then it's just a matter of doing the BLACS or ScaLAPACK call. An example program using the ScaLAPACK psgemv for matrix-vector multiplication, rather than psgemm, is listed in my answer to a question on the Computational Science Stack Exchange.
To give you an idea of how the pieces fit together, the following is a simple program which reads in a binary file containing a matrix (first the size of the square matrix N and then the N^2 elements) and then calculates the eigenvalues and eigenvectors using ScaLAPACK's (new) pssyevr routine. It combines the MPI-IO, darray, and ScaLAPACK stuff. It's in Fortran, but the function calls are the same in C-based languages.