Slurm Commands and Scripts

Basic slurm commands are:

slurm                 use
sbatch                submit <job file>
srun                  submit interactive job
squeue                list all queued jobs
squeue -u rfeynman    list queued jobs for user rfeynman
scancel               cancel <job#>
sinfo                 node status; list of queues
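
A typical session chains these commands: submit a job, note its ID, check the queue, and cancel if needed. The sketch below assumes sbatch prints its usual "Submitted batch job <id>" confirmation line. The helper name parse_jobid and the script name myjob.sh are hypothetical and not part of Slurm.

```shell
#!/bin/bash
# parse_jobid: extract the job ID from sbatch's confirmation line,
# e.g. "Submitted batch job 12345" (hypothetical helper, not a Slurm command).
parse_jobid() {
    local line="$1"
    local id
    # The job ID is the fourth whitespace-separated field.
    read -r _ _ _ id _ <<< "$line"
    echo "$id"
}

# Usage on a cluster (commented out so the sketch runs anywhere):
# jobid=$(parse_jobid "$(sbatch myjob.sh)")
# squeue -u "$USER"     # list your queued jobs
# scancel "$jobid"      # cancel the job if it is no longer needed
```

Taking the fourth field rather than the last one keeps the helper working if extra text follows the job ID.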

A Torque compatibility layer also provides some Torque commands, such as qstat and qsub. A basic Slurm script looks like:

#!/bin/bash
#SBATCH --job-name=mpi
#SBATCH --output=zzz.slurm
#SBATCH --partition comp06
#SBATCH --nodes=2
#SBATCH --tasks-per-node=32
#SBATCH --time=6:00:00
cd $SLURM_SUBMIT_DIR
module purge
module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
mpirun -np $SLURM_NTASKS -machinefile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} ./mympiexe -inputfile MA4um.mph -outputfile MA4um-output.mph

and a more complex script with file moving looks like:

#!/bin/bash
#SBATCH --job-name=espresso
#SBATCH --output=zzz.slurm
#SBATCH --nodes=4
#SBATCH --tasks-per-node=32
#SBATCH --time=00:00:10
#SBATCH --partition comp06
module purge
module load intel/14.0.3 mkl/14.0.3 fftw/3.3.6 impi/5.1.2
cd $SLURM_SUBMIT_DIR
cp *.in *UPF /scratch/$SLURM_JOB_ID
cd /scratch/$SLURM_JOB_ID
mpirun -ppn 16 -hostfile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} -genv OMP_NUM_THREADS 2 \
/share/apps/espresso/qe-6.1-intel-mkl-impi/bin/pw.x -npools 1 <ausurf.in
mv ausurf.log *mix* *wfc* *igk* $SLURM_SUBMIT_DIR/

See also [ https://www.marquette.edu/high-performance-computing/pbs-to-slurm.php ] [ https://hpc.nih.gov/docs/pbs2slurm.html ]

We have a conversion script, /share/apps/bin/pbs2slurm.sh, which should do 95% of the conversion from old PBS scripts to Slurm scripts. Please report any errors the script makes so we can improve it. It should normally be in your path, and

pbs2slurm.sh <pbs-script-name>

will write the converted script to stdout, so save it with

pbs2slurm.sh demoscriptpbs.sh > demoscriptslurm.sh
Notes:

A leading hash-bang line (#!/bin/sh, #!/bin/bash, or #!/bin/tcsh) is optional in Torque but required in Slurm. pbs2slurm.sh inserts one if it is not present.
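
The hash-bang rule can be checked by hand before submitting. The following is a minimal sketch of that check and is not the actual pbs2slurm.sh code. The function name ensure_shebang is hypothetical, and the choice of /bin/bash as the inserted interpreter is an assumption.

```shell
#!/bin/bash
# ensure_shebang: print a script to stdout, adding "#!/bin/bash" as the
# first line when the script does not already start with "#!".
# (Hypothetical helper; the inserted interpreter is an assumption.)
ensure_shebang() {
    local script="$1"
    local first=""
    # Read only the first line; an empty file leaves "first" empty.
    IFS= read -r first < "$script" || true
    case "$first" in
        '#!'*) ;;                   # already has an interpreter line
        *) echo '#!/bin/bash' ;;    # Slurm requires one, so add it
    esac
    cat "$script"
}

# Usage: ensure_shebang oldjob.sh > newjob.sh
```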

Slurm time formats with days use “2-00:00:00”, not “2:00:00:00” as in Torque. If the time is invalid, sbatch will use the partition default and srun will reject the job.
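
The day-format difference is easy to fix mechanically. The sketch below rewrites a four-field Torque walltime into the Slurm form and leaves anything else untouched. The function name torque_to_slurm_time is hypothetical, and this is an illustration of the rule rather than the logic pbs2slurm.sh uses.

```shell
#!/bin/bash
# torque_to_slurm_time: rewrite a Torque-style walltime that includes days
# (D:HH:MM:SS) in Slurm's D-HH:MM:SS form. Values without exactly four
# colon-separated fields (e.g. 6:00:00) are passed through unchanged.
torque_to_slurm_time() {
    local t="$1"
    local d h m s extra
    IFS=: read -r d h m s extra <<< "$t"
    if [ -n "$s" ] && [ -z "$extra" ]; then
        echo "${d}-${h}:${m}:${s}"   # four fields: days present
    else
        echo "$t"                    # no days field, or unrecognized form
    fi
}

# Usage: torque_to_slurm_time 2:00:00:00
```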

Unlike Torque, Slurm does not autogenerate an MPI machinefile/hostfile, so the job creates

/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}

The generated machinefile differs from the Torque machinefile in that it has one entry per host instead of ncores entries per host. Slurm does define a variable with the total number of tasks, $SLURM_NTASKS, which equals the total number of cores when one task is requested per core and so suits most MPI jobs that use every core.
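
Some older programs expect the Torque layout with one line per core. A minimal sketch of expanding the one-entry-per-host file into that layout follows. The function name expand_machinefile and the output file name are hypothetical, and the per-host count is assumed to match the --tasks-per-node value in your script.

```shell
#!/bin/bash
# expand_machinefile: repeat each host in a one-entry-per-host machinefile
# N times, giving the one-entry-per-core layout that Torque produced.
# (Hypothetical helper, not part of Slurm or the site tools.)
expand_machinefile() {
    local machinefile="$1"
    local per_host="$2"
    local host i
    # The "|| [ -n ... ]" clause keeps a final line that lacks a newline.
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue      # skip blank lines
        for ((i = 0; i < per_host; i++)); do
            echo "$host"
        done
    done < "$machinefile"
}

# Usage inside a job (the output file name is hypothetical):
# expand_machinefile "/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}" 32 \
#     > "/scratch/${SLURM_JOB_ID}/machinefile_percore"
```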

slurm_scripts.txt · Last modified: 2022/05/02 17:03 by root