This coming semester, I am starting research on large-scale distributed computing with MPI. I am looking for help with the initial stages, specifically with setting up a solid development environment. Does anyone have recommendations for good tools to use for this?
I am also curious whether there is a simulator that would allow me to write MPI programs and distribute them to virtual (rather than physical) nodes.
You could download an MPI library such as Open MPI or MPICH and run it on a multi-core system (such as a recent desktop), with the number of processes equal to the number of cores. The processes would communicate without a network interconnect (for instance, over shared memory). That should be enough to explore initially.
If you really want multiple nodes, you can experiment with several VMs connected by a virtual network before moving on to a physical cluster. One of the VMs would have to be configured to act as an NFS server, and the rest of the VMs could mount your home directory over NFS, so that every node sees the same files at the same path.