====Slurm sbatch/srun scripts====
Slurm jobs may be submitted by:
1. Slurm batch scripts submitted by ''sbatch''
2. PBS batch scripts submitted by ''sbatch'' or ''qsub''
3. Slurm interactive jobs submitted by ''srun''
4. Slurm interactive and graphical jobs submitted through [[ portal_login_new | OpenOnDemand ]]
Essential slurm options and their available values are described in [[ selecting_resources | Selecting Resources ]]. The same constraints apply regardless of how the job is submitted.
Basic slurm commands are:
slurm command , use
sbatch , submit a batch job
srun , submit an interactive job
squeue , list all queued jobs
squeue -u rfeynman , list queued jobs for user rfeynman
scancel , cancel a queued or running job by job number
sinfo , show node status and the list of partitions (queues)
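For example, a typical submit, monitor, and cancel sequence looks like the following; the script name, job number, and node names are illustrative:
$ sbatch mympi.slurm
Submitted batch job 1430900
$ squeue -u rfeynman
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
1430900    comp06    mympi rfeynman  R   0:12      2 c[1401-1402]
$ scancel 1430900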
A basic slurm batch script for MPI (2 full nodes) follows. It should begin with "#!/bin/sh" or another shell such as "#!/bin/tcsh".
For MPI jobs that span more than one node, a ''hostfile'' or ''machinefile'' is required; for single-node MPI it is optional. The machinefile referenced below is auto-generated at job startup and is named with the unique job number ${SLURM_JOB_ID}. At runtime the slurm variable ${SLURM_NTASKS} is defined as nodes x tasks-per-node, here 2 x 32 = 64. Unless the job runs out of memory, tasks-per-node x cpus-per-task should usually equal the number of cores in a partition node (here 32) so that a full node is allocated, and the total number of MPI processes is usually that value times the number of nodes.
#!/bin/sh
#SBATCH --partition comp06
#SBATCH --qos comp
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32
#SBATCH --cpus-per-task=1
#SBATCH --time=6:00:00
module purge
module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
mpirun -np $SLURM_NTASKS -machinefile /scratch/${SLURM_JOB_ID}/machinefile_\
${SLURM_JOB_ID} ./mympiexe logfile
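By default ''sbatch'' writes the job's standard output and error to a file named slurm-<jobid>.out in the directory the job was submitted from. To choose different file names, directives such as the following can be added to the script (the names here are only examples; %j expands to the job number):
#SBATCH --output=mympi_%j.out
#SBATCH --error=mympi_%j.err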
A similar interactive job with one node in the ''comp01'' partition would be:
srun --nodes 1 --ntasks-per-node=1 --cpus-per-task=32 --partition comp01 --qos comp \
--time=1:00:00 --pty /bin/bash
All of the slurm options between ''srun'' and ''--pty'' have the same meaning for ''srun'' and ''sbatch''.
Once the interactive shell starts on the compute node, the ''module'' and ''mpirun'' commands are entered at the prompt.
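For example, a single-node run of the same program inside the interactive shell might look like this (no machinefile is needed for a single node, and ''mympiexe'' is the same example executable as above):
$ module purge
$ module load intel/18.0.1 impi/18.0.1 mkl/18.0.1
$ mpirun -np 32 ./mympiexe logfile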
A PBS compatibility layer will run simple PBS scripts under slurm. Basic PBS directives that can be interpreted as slurm directives are translated. The commands ''qsub'' and ''qstat -u'' are available.
$ cat qcp2.sh
#!/bin/bash
#PBS -q cloud72
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=2
sleep 5
echo $HOSTNAME
$ qsub qcp2.sh
1430970
$ cat qcp2.sh.o1430970
c1331
$
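For reference, a native slurm script roughly equivalent to the PBS example above is shown below; it assumes the ''cloud'' qos pairs with the ''cloud72'' partition, following the partition/qos pattern used elsewhere on this page:
#!/bin/bash
#SBATCH --partition cloud72
#SBATCH --qos cloud
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
sleep 5
echo $HOSTNAME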
A bioinformatics and large-data example follows. When a job produces more than about 500 MB or 10,000 files of output, direct program output and temporary files to the /scratch/ or /local_scratch/ directory to avoid excess load on the main storage /scrfs/storage/. This job produces about 30 GB of output in 5 files.
This script has five parts: 1) slurm directives, 2) job setup, 3) change to the scratch directory and create a ''tmprun'' directory, 4) run ''trimmomatic'' in the tmprun directory, and 5) copy the output files back and remove them from scratch after a successful copy.
If you don't fully understand the last step, do the copy-back and delete manually, since a misapplied ''rm -rf'' can be very dangerous. Any possible damage is limited by using a specific name such as "tmprun" that is unlikely to contain important data. Don't run ''rm -rf *'' unless you are very sure of what you are doing.
#!/bin/bash
#SBATCH --partition tres72
#SBATCH --qos tres
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=32
#SBATCH --time=72:00:00
# job setup: load the conda environment and set the thread count
module load python/anaconda-3.7.3
source /share/apps/bin/conda-3.7.3.sh
conda activate tbprofiler
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# go to the job's scratch directory and create the tmprun work directory
cd /scratch/$SLURM_JOB_ID
mkdir -p tmprun
cd tmprun
# run trimmomatic in tmprun
FILE=/storage/jpummil/C.horridus/SnakeNanopore/Kausttrim
# PE mode takes the forward and reverse inputs, then paired and unpaired outputs
# for each direction (output names here are illustrative), then the trimming steps
trimmomatic PE -threads $SLURM_CPUS_PER_TASK ${FILE}F.fq ${FILE}R.fq \
KausttrimF-paired.fq KausttrimF-unpaired.fq KausttrimR-paired.fq KausttrimR-unpaired.fq \
ILLUMINACLIP:TruSeq3-SE:2:30:10 HEADCROP:5 LEADING:3 \
TRAILING:3 SLIDINGWINDOW:4:28 MINLEN:65
# copy the output back to the submit directory; remove the scratch copy only if the copy succeeded
cd ..
rsync -av tmprun $SLURM_SUBMIT_DIR/
if [ $? -eq 0 ]; then
    rm -rf tmprun
fi
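To do the copy-back and cleanup by hand instead, as suggested above, the commands after the job finishes would look something like this; the job number and destination directory are illustrative:
$ cd /scratch/<jobid>
$ rsync -av tmprun /scrfs/storage/rfeynman/myproject/
$ rm -rf tmprun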