PBS and Slurm Commands and Scripts

Basic torque/slurm commands are:

torque                       slurm                 use
qsub                         sbatch                submit <job file>
qsub -I                      srun                  submit interactive job
qstat                        squeue                list all queued jobs
qstat -u rfeynman            squeue -u rfeynman    list queued jobs for user rfeynman
qdel                         scancel               cancel <job#>
shownodes -l -n; qstat -q    sinfo                 node status; list of queues
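
As a quick example, a typical slurm session with these commands might look like the following (the script name and job number are hypothetical):

sbatch myjob.slurm       # prints: Submitted batch job 123456
squeue -u rfeynman       # watch the queued/running job
scancel 123456           # cancel the job if needed
sinfo                    # node status and list of partitions/queues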

A basic script in torque (first), followed by its slurm equivalent, looks like:

#PBS -N mpi
#PBS -j oe
#PBS -o zzz.$PBS_JOBID
#PBS -l nodes=2:ppn=32,walltime=06:00:00
#PBS -q q06h32c
cd $PBS_O_WORKDIR
NP=$(wc -l < $PBS_NODEFILE)   # total core count: one nodefile line per core
module purge
module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
mpirun -np $NP -machinefile $PBS_NODEFILE ./mympiexe -inputfile MA4um.mph -outputfile MA4um-output.mph

#!/bin/bash
#SBATCH --job-name=mpi
#SBATCH --output=zzz.slurm
#SBATCH --partition comp06
#SBATCH --nodes=2
#SBATCH --tasks-per-node=32
#SBATCH --time=6:00:00
cd $SLURM_SUBMIT_DIR
module purge
module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
mpirun -np $SLURM_NTASKS -machinefile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} ./mympiexe -inputfile MA4um.mph -outputfile MA4um-output.mph
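
With the directives above (--nodes=2 and --tasks-per-node=32), slurm sets SLURM_NTASKS to 64, so there is no need to count the machinefile by hand as the torque script does. A quick sanity check inside a job:

# print the allocation size slurm computed for this job
echo "nodes: $SLURM_JOB_NUM_NODES  tasks: $SLURM_NTASKS"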

and a more complex script with file moving (again torque first, then slurm) looks like:

#PBS -N espresso
#PBS -j oe
#PBS -o zzz.$PBS_JOBID
#PBS -l nodes=4:ppn=32,walltime=00:00:10
#PBS -q q06h32c
module purge
module load intel/14.0.3 mkl/14.0.3 fftw/3.3.6 impi/5.1.2
cd $PBS_O_WORKDIR
cp *.in *UPF /scratch/$PBS_JOBID
cd /scratch/$PBS_JOBID
sort -u $PBS_NODEFILE >hostfile
mpirun -ppn 16 -hostfile hostfile -genv OMP_NUM_THREADS 4 -genv MKL_NUM_THREADS 4 /share/apps/espresso/qe-6.1-intel-mkl-impi/bin/pw.x -npools 1 <ausurf.in
mv ausurf.log *mix* *wfc* *igk* $PBS_O_WORKDIR/

#!/bin/bash
#SBATCH --job-name=espresso
#SBATCH --output=zzz.slurm
#SBATCH --nodes=4
#SBATCH --tasks-per-node=32
#SBATCH --time=00:00:10
#SBATCH --partition comp06
module purge
module load intel/14.0.3 mkl/14.0.3 fftw/3.3.6 impi/5.1.2
cd $SLURM_SUBMIT_DIR
cp *.in *UPF /scratch/$SLURM_JOB_ID
cd /scratch/$SLURM_JOB_ID
mpirun -ppn 16 -hostfile /scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID} -genv OMP_NUM_THREADS 4 -genv MKL_NUM_THREADS 4 /share/apps/espresso/qe-6.1-intel-mkl-impi/bin/pw.x -npools 1 <ausurf.in
mv ausurf.log *mix* *wfc* *igk* $SLURM_SUBMIT_DIR/

See also:
https://hprc.tamu.edu/wiki/TAMU_Supercomputing_Facility:HPRC:Batch_Translation
https://slurm.schedmd.com/rosetta.pdf
https://www.sdsc.edu/~hocks/FG/PBS.slurm.html

We have a conversion script /share/apps/bin/pbs2slurm.sh which should do about 95% of the script conversion. Please report errors made by the script so we can improve it. Normally it is in your path, and

pbs2slurm.sh <pbs-script-name>

will write the converted script to stdout, so save it with "> newscript.slurm".
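
For example, to convert a torque script and submit the result (file names here are hypothetical):

pbs2slurm.sh myjob.pbs > myjob.slurm
sbatch myjob.slurm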

Notes:

A leading hash-bang line (#!/bin/sh, #!/bin/bash, or #!/bin/tcsh) is optional in torque but required in slurm; pbs2slurm.sh inserts one if not present.
Slurm does not autogenerate an MPI machinefile/hostfile, so each job here creates one at

/scratch/${SLURM_JOB_ID}/machinefile_${SLURM_JOB_ID}

The generated machinefile differs from the torque machinefile in that it has one entry per host instead of one entry per core. Slurm does define $SLURM_NTASKS, the total number of allocated cores, which suits most MPI jobs that use every core. A sketch of generating such a machinefile by hand follows.
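
If you need a one-entry-per-host file where the pre-generated one is not available, scontrol can expand the job's node list; a minimal sketch, assuming the same one-file-line-per-host layout as the espresso example:

# write one line per allocated host, like torque's: sort -u $PBS_NODEFILE
scontrol show hostnames $SLURM_JOB_NODELIST > hostfile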
