Compile your code using mpicc. Then you can launch it with mpiexec in batch mode using a batch script:
#!/bin/bash
#SBATCH --job-name=picalc               # Job name
#SBATCH --mail-type=ALL                 # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=email@example.com   # Where to send mail
#SBATCH --ntasks=48
#SBATCH --nodes=2                       # Number of nodes
#SBATCH --tasks-per-node=24
#SBATCH --output=mpi_test_%j.out        # Standard output and error log

mpiexec ./a.out
date
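Assuming the script above is saved as picalc.sbatch (the file name is illustrative), a typical submit-and-check sequence looks like this:

```shell
# Submit the batch script; SLURM replies with "Submitted batch job <id>"
sbatch picalc.sbatch

# Check the state of your queued and running jobs
squeue -u "$USER"

# After completion, inspect the output file (%j in the script expands to the job id)
ls mpi_test_*.out
```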
We have MPICH2 support cluster-wide. MPICH2 is compiled in two different ways:

- Under /opt/mpich2/ you will find the MPICH2 subtree compiled with --pm=hydra. This version uses the Hydra process manager, which is the default PM at the moment, and supports SLURM natively. Please refer to this page (section "MPICH with MPIEXEC"), and have a look at this other page for an overview of the Hydra process manager.
- Under /opt/mpich2-slurm/ you will find the MPICH2 subtree compiled with --with-pmi=slurm --with-pm=none. Linking against this version allows you to use SLURM's srun to schedule your tasks. Please refer to this page.
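To verify which of the two builds your shell is currently using, you can check where the compiler wrapper resolves (the module names below are the ones used later on this page):

```shell
module load mpi/mpich2   # or mpi/mpich2-slurm for the SLURM-PM build
which mpicc              # should point inside /opt/mpich2/ (or /opt/mpich2-slurm/)
mpicc -show              # MPICH wrapper option: prints the underlying compile/link command
```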
MPICH2 3.2 Hydra PM
MPICH2 ships with the Hydra Process Manager. This process manager interacts natively with SLURM. You can load this environment before compiling your code to link against the correct libraries:
module load mpi/mpich2
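With the environment loaded, the MPI compiler wrapper is on your PATH. A minimal compile line (the source file name picalc.c is illustrative):

```shell
# Compile an MPI source file with the MPICH wrapper; -o names the output binary
mpicc -O2 -o a.out picalc.c
```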
A drawback of using the native Hydra PM (as of version 3.2) is that processes must be executed interactively: you allocate resources, execute your program, and then release the resources, all manually.
To launch your binary, create an allocation with SLURM and then execute your program. From a bash shell, call salloc with the desired reservation (e.g. number of nodes, number of tasks, number of tasks per node):

salloc -N 2

The command above asks for 2 nodes, each running one task. Once the allocation has been granted, start your job:
mpiexec -localhost frontend ./a.out
The "-localhost frontend" option is mandatory for the MPI processes to talk to each other, and "a.out" is your program binary. Once your job has completed, release the resources: a call to exit from the command line is enough. Please refer carefully to the salloc documentation here.
A better syntax is the following one-liner, which also shows some additional options:
salloc --ntasks=48 --tasks-per-node=24 --nodes=2 mpiexec -localhost frontend ./a.out
This last syntax releases the resources automatically once your task has completed.
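The one-liner above can be wrapped in a small helper script. The script name and the defaults below are illustrative, not part of the cluster setup:

```shell
#!/bin/bash
# run_mpi.sh -- hypothetical wrapper around salloc + mpiexec
# Usage: ./run_mpi.sh <binary> [ntasks] [nodes]
BIN=${1:?usage: run_mpi.sh <binary> [ntasks] [nodes]}
NTASKS=${2:-48}
NODES=${3:-2}

# salloc releases the allocation automatically when mpiexec returns
salloc --ntasks="$NTASKS" --nodes="$NODES" --tasks-per-node=$((NTASKS / NODES)) \
    mpiexec -localhost frontend "$BIN"
```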
A full dump of the bash interface of a real command submission:

mmiralto@kraken:~/pi-calc-mpi$ salloc -N 5
salloc: Granted job allocation 415
mmiralto@kraken:~/pi-calc-mpi$ mpiexec -localhost frontend ./a.out
This is my sum: 0.6283185316069759 from rank: 0 name: node4
This is my sum: 0.6283185311622378 from rank: 1 name: node5
This is my sum: 0.6283185302735453 from rank: 3 name: node7
This is my sum: 0.6283185298290889 from rank: 4 name: node9
This is my sum: 0.6283185307178860 from rank: 2 name: node6
Pi is approximately 3.1415926535897341, Error is 0.0000000000000591
Time of calculating PI is: 2.461559
mmiralto@kraken:~/pi-calc-mpi$ exit
salloc: Relinquishing job allocation 415
salloc: Job allocation 415 has been revoked.
MPICH2 with SLURM PM
The Hydra PM documented above can be disabled, and SLURM itself can be used as the process manager. This allows you to run your jobs with srun, but it can introduce some overhead in inter-process communication and thus in performance.
To compile your code against these libraries, issue the following command in your shell:
module load mpi/mpich2-slurm
Once you have compiled your code with mpicc, you can write a SLURM batch script to execute it:
#!/bin/bash
#SBATCH --job-name=picalc                             # Job name
#SBATCH --mail-type=ALL                               # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=firstname.lastname@example.org    # Where to send mail
#SBATCH --ntasks=48
#SBATCH --nodes=2                                     # Number of nodes
#SBATCH --tasks-per-node=24
#SBATCH --output=mpi_test_%j.out                      # Standard output and error log

# Output some generic information
pwd; hostname; date
echo "Running example mpich2 binary. Using $SLURM_JOB_NUM_NODES nodes with $SLURM_NTASKS tasks, each with $SLURM_CPUS_PER_TASK cores."

# Eventually purge your environment and load the one(s) you need
#module purge; module load mpi/mpich2-slurm
module load mpi/mpich2-slurm

# Now launch your MPI app
srun ./a.out
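The $SLURM_* variables used in the echo line above are exported by SLURM only inside a job; outside an allocation they are unset. A small sketch showing how shell fallbacks keep such a line working in both cases:

```shell
# SLURM exports these inside a job; the ":-1" fallbacks make the line
# print "1" when run outside an allocation instead of an empty value
echo "Running on ${SLURM_JOB_NUM_NODES:-1} node(s) with ${SLURM_NTASKS:-1} task(s)."
```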