Slurm Commands and Scripts

Basic slurm commands are:

sbatch <job file>        submit a batch job
srun                     submit an interactive job
squeue                   list all queued jobs
squeue -u rfeynman       list queued jobs for user rfeynman
scancel <job#>           cancel a job
sinfo                    show node status and list of queues
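
For example, an interactive shell on a compute node can be requested with srun. This is only a sketch; the partition, node count, and time limit are placeholders (comp06 is used here because it appears in the batch examples below):

srun --nodes=1 --ntasks-per-node=1 --partition comp06 --time=1:00:00 --pty /bin/bash

When the allocation is granted you are placed in a shell on a compute node; type exit to end the session and release the allocation.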

A basic script in slurm looks like:

#!/bin/bash
#SBATCH --job-name=mpi
#SBATCH --output=zzz.slurm
#SBATCH --partition comp06
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --time=6:00:00
cd $SLURM_SUBMIT_DIR
module purge
module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
mpirun -np $SLURM_NTASKS -machinefile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} ./mympiexe -inputfile MA4um.mph -outputfile MA4um-output.mph
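
Assuming the script above is saved as mpi.slurm (the file name is arbitrary), it is submitted and monitored with the commands listed earlier:

sbatch mpi.slurm
squeue -u rfeynman
scancel <job#>

sbatch prints the assigned job number, and the job's standard output is written to the file named by --output (zzz.slurm here).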

A more complex script with file moving looks like:

#!/bin/bash
#SBATCH --job-name=espresso
#SBATCH --output=zzz.slurm
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=32
#SBATCH --time=00:00:10
#SBATCH --partition comp06
module purge
module load intel/14.0.3 mkl/14.0.3 fftw/3.3.6 impi/5.1.2
cd $SLURM_SUBMIT_DIR
cp *.in *UPF /scratch/$SLURM_JOB_ID
cd /scratch/$SLURM_JOB_ID
mpirun -ppn 16 -hostfile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} -genv OMP_NUM_THREADS 4 -genv MKL_NUM_THREADS 4 /share/apps/espresso/qe-6.1-intel-mkl-impi/bin/pw.x -npools 1 <ausurf.in
mv ausurf.log *mix* *wfc* *igk* $SLURM_SUBMIT_DIR/

See also:

https://hprc.tamu.edu/wiki/TAMU_Supercomputing_Facility:HPRC:Batch_Translation
https://slurm.schedmd.com/rosetta.pdf
https://www.sdsc.edu/~hocks/FG/PBS.slurm.html

We have a conversion script, /share/apps/bin/pbs2slurm.sh, which should do about 95% of the conversion from old PBS scripts to Slurm scripts. Please report any errors made by the script so we can improve it. It should normally be in your path, and

pbs2slurm.sh <pbs-script-name>

writes the converted script to stdout, so redirect the output to save it:

pbs2slurm.sh <pbs-script-name> > newscript.slurm
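
For reference, the usual directive mapping looks like the following. This is only an illustration of the common PBS-to-Slurm correspondence (see the links above); the converter's actual output may differ, and the values shown are taken from the examples in this section:

#PBS -N mpi                 ->  #SBATCH --job-name=mpi
#PBS -q comp06              ->  #SBATCH --partition comp06
#PBS -l nodes=2:ppn=32      ->  #SBATCH --nodes=2 --ntasks-per-node=32
#PBS -l walltime=6:00:00    ->  #SBATCH --time=6:00:00
#PBS -o zzz.slurm           ->  #SBATCH --output=zzz.slurm
cd $PBS_O_WORKDIR           ->  cd $SLURM_SUBMIT_DIR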

Notes:

A leading hash-bang line (#!/bin/sh, #!/bin/bash, or #!/bin/tcsh) is optional in Torque but required in Slurm; pbs2slurm.sh inserts one if it is not present.
Slurm does not autogenerate an MPI machinefile/hostfile, so the job creates

/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}

The generated machinefile differs from the Torque machinefile in that it has one entry per host instead of ncores entries per host. Slurm does define a variable, $SLURM_NTASKS, with the total number of tasks (one per core in these examples), which is suitable for most MPI jobs that use every core.
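
If your MPI launcher expects a hostfile in a different format, you can build one yourself inside the job script from Slurm's node list. A minimal sketch (the file names and the core count of 32 are placeholders):

# one line per allocated host
scontrol show hostnames $SLURM_JOB_NODELIST > hosts_per_node
# one line per core, if the launcher expects that format (32 cores per node assumed)
awk '{for (i = 0; i < 32; i++) print}' hosts_per_node > hosts_per_core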