Referring to the mpi page, we will use the openmpi/4.1.4 and mvapich2/2.3.7 MPI modules, and the same form of mpirun/mpiexec. Referring to the python page, we will create a conda environment to match each MPI variant. Mixing modules and conda can be a little tricky, as both try to take over the current environment, most importantly $PATH, which sets the search path for executables such as python or mpiexec. Both conda and module load put their new search path at the front of $PATH, so in general the last conda environment activated or module loaded controls $PATH.
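A quick, optional way to see which stack currently controls the environment is to ask the shell where it finds the relevant executables after each load. This is only an illustrative check, not part of the installation:

# confirm which python and mpiexec the shell will actually run
which python
which mpiexec
# the first entries of $PATH belong to whatever was loaded last
echo $PATH | tr ':' '\n' | head -5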
Here we will create two conda environments for mpi4py, in openmpi and mvapich2 variants. Once you have created an environment, you can go ahead and add the other python modules that you may need (package collections managed by conda or pip, not to be confused with the lmod/environment modules used in HPC and their module commands).
HPCC supplies a base miniconda installation for each python release (3.10 at this writing), each to be followed by a corresponding source command that takes the place of the changes conda would otherwise try to make to your ~/.bashrc, since changing your ~/.bashrc is a bad idea in HPC.
We will create a new conda environment and add the python module mpich (the predecessor of mvapich and Intel MPI) to it.
module purge
module load gcc/11.2.1 mkl/19.0.5 python/3.10-anaconda
source /share/apps/bin/conda-3.10.sh
conda create -n mpi4py-mvapich-3.10    #use a different name if you make a new one
conda activate mpi4py-mvapich-3.10
conda install mpich
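With the environment active, any additional python packages you need (as mentioned above) can be installed now or later. The package names here are only examples; install whatever your own programs require:

# examples only -- add the packages your programs actually need
conda install numpy scipy
pip install h5py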
Oddly, the conda mpich package does not install a working MPI implementation, but the conda openmpi package (shown below) does. In either case we want to override it with the locally installed version optimized for InfiniBand, so we will use module to push the local installation of either mvapich2 or openmpi to the top of $PATH, and also load the cuda module if we happen to be doing a CUDA build on a GPU node. If you intend to use your python programs with a GPU, use srun to build on a GPU node with the cuda module loaded; on non-GPU nodes the cuda module will simply be ignored.
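For a GPU build, one way to get a shell on a GPU node is an interactive srun session. The partition and GPU request below are only examples, since those names are site-specific:

# request an interactive shell on a GPU node (partition/gres names are examples)
srun --partition=gpu --gres=gpu:1 --time=1:00:00 --pty bash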
Then we will download and install the latest mpi4py from git to make sure it is recompiled with the optimized MPI of either variant.
module load mvapich2/2.3.7 cuda/11.7
git clone https://github.com/mpi4py/mpi4py
cd mpi4py
python setup.py install
cd ..
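As an optional sanity check, mpi4py can report which MPI library it was actually built against; with the mvapich2 module loaded, the version string should name MVAPICH2 rather than the conda-supplied MPICH:

# print the MPI library mpi4py is linked against
python -c "from mpi4py import MPI; print(MPI.Get_library_version())"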
The installation of mpi4py is complete and we can now test under slurm. We will repeat the commands we need at runtime, substituting the name of your environment if you created one; you can also run this as-is, without any installation, using the name shown. To test under slurm we'll try 2 nodes and 4 MPI tasks per node, as on the mpi page. We'll also need other srun/#SBATCH statements, such as partition and time, but they don't affect MPI directly.
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=1

hostfile=/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}

module purge
module load gcc/11.2.1 mkl/19.0.5 python/3.10-anaconda
source /share/apps/bin/conda-3.10.sh
conda activate mpi4py-mvapich-3.10
module load mvapich2/2.3.7 cuda/11.7

mpiexec -ppn 4 -hostfile $hostfile python test_mpi.py
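If the job script above is saved as, say, test_mpi.sb (the name is arbitrary), it is submitted in the usual slurm way and the results appear in the standard slurm output file:

# submit the job; output goes to slurm-<jobid>.out by default
sbatch test_mpi.sb
# check that it is queued or running
squeue -u $USER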
The program 'test_mpi.py' is:

from mpi4py import MPI
import sys

size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()
sys.stdout.write("Hello, World! I am process %d of %d on %s.\n" % (rank, size, name))
and the output, with node names from your $hostfile, should be:

Hello, World! I am process 5 of 8 on c1716.
Hello, World! I am process 6 of 8 on c1716.
Hello, World! I am process 4 of 8 on c1716.
Hello, World! I am process 7 of 8 on c1716.
Hello, World! I am process 1 of 8 on c1715.
Hello, World! I am process 0 of 8 on c1715.
Hello, World! I am process 3 of 8 on c1715.
Hello, World! I am process 2 of 8 on c1715.
Using openmpi you can follow the same process, substituting “openmpi” for mvapich2 or mpich, and its current version 4.1.4 for 2.3.7.
module purge
module load gcc/11.2.1 mkl/19.0.5 python/3.10-anaconda
source /share/apps/bin/conda-3.10.sh
conda create -n mpi4py-openmpi-3.10    #use a different name if you make a new one
conda activate mpi4py-openmpi-3.10
conda install openmpi

module load openmpi/4.1.4 cuda/11.7
git clone https://github.com/mpi4py/mpi4py
cd mpi4py
python setup.py install
cd ..
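If you want to confirm that the module's Open MPI, rather than the copy conda installed, is now first in your path, ompi_info (part of Open MPI itself) reports the version in use:

# should point at the openmpi/4.1.4 module and report that version
which mpiexec
ompi_info | head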
At runtime we'll also need the openmpi form of mpiexec, replacing the compiled program my_MPI_executable from the mpi page with python test_mpi.py (the -x options export $PATH and $LD_LIBRARY_PATH to the remote ranks so they find the same python and MPI libraries):
#SBATCH --nodes=2
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=1

hostfile=/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}

module load gcc/11.2.1 mkl/19.0.5 python/3.10-anaconda
source /share/apps/bin/conda-3.10.sh
conda activate mpi4py-openmpi-3.10
module load openmpi/4.1.4 cuda/11.7

mpiexec -np 8 --map-by node -hostfile $hostfile -x PATH -x LD_LIBRARY_PATH python test_mpi.py
and the output should be similar to the mvapich2 case above.