==== Environment Modules and .bashrc ====
  
The [[http://modules.sourceforge.net/|Modules]] package is supplied on the system to set up the user's environment variables to run a choice of the needed programs and versions. The most important of these variables are ''$PATH'', telling the system where to find executable files such as ''matlab'' or ''mpirun'', and ''$LD_LIBRARY_PATH'', telling the system where to find shared libraries that an executable calls.  You can manipulate the environment variables yourself instead of calling modules, but there is no advantage in doing so.
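For illustration, this is roughly what a ''module load'' does behind the scenes (a minimal sketch; the ''/share/apps/matlab'' paths are hypothetical, used only to show the mechanism):

```shell
# Roughly what "module load matlab" would do by hand (hypothetical paths):
export PATH=/share/apps/matlab/bin:$PATH
# Prepend to LD_LIBRARY_PATH without leaving a trailing colon when it was unset:
export LD_LIBRARY_PATH=/share/apps/matlab/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

# The shell now searches /share/apps/matlab/bin first for executables:
echo "${PATH%%:*}"   # prints: /share/apps/matlab/bin
```

''module unload'' reverses these edits cleanly, which is the main practical reason to prefer modules over editing the variables by hand.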
<code>
$ module avail
gcc/4.7.2     impi/5.0.0    intel/14.0.3  module-git    modules       null          openmpi/1.8.8 use.own
$ module purge
$ module load intel/14.0.3 impi/5.1.1
$ module list
Currently Loaded Modulefiles:
  1) intel/14.0.3   2) impi/5.1.1
$ module purge
$ module load intel/14.0.3 mkl/14.0.3 mvapich2/2.1
</code>
  
''module avail'' can be slow to run.  A quicker way to see only the top-level names (shown here on razor; the same works on either cluster) is:
<code>
$ ls /share/apps/modulefiles
abinit    cdhit      flash        idb        mkl          netcdf    pindel        reapr        tophat
abyss     changa     gadget       ifort      module-cvs   null      platform_mpi  samtools     transdecoder
allpaths  cmake      gamess       impi       module-info  nwchem    pqs           scipy        trilinos
augustus  cplex      gcc          infernal   modules      open64    proj.4        sickle       trinity
bbmap     crb-blast  gdal         intel      mono         opencv    python        siesta       usearch
bib       cuda       GenomeVISTA  java       moose        openfoam  qiime         soapdenovo2  use.own
blast     cufflinks  glpk         last       mothur       openmpi   qlogicmpi     spades       velvet
blat      dDocent    gotoblas2    LIGGGHTS   mpiblast     orca      qmcpack       sra-tools    ViennaRNA
boost     deepTools  grass        maker      mpt          os        quast         sunstudio    visit
bowtie    dot        gromacs      matlab     muscle       parallel  R             swat         wise2
bowtie2   emboss     gurobi       mcl        mvapich      pear      randfold      swig
busco     fastqc     hdf5         migrate-n  mvapich2     perl      raxml         symphony
bwa       fftw       hmmer        miRDeep    ncl          PGI       rDock         tcltk
</code>
''ls -R /share/apps/modulefiles | more'' gives a full list of every file.
  
The default ''.bashrc'' loaded with a new account includes three ''module'' loads.
<code>
$ cat ~/.bashrc
</code>
  
''.bashrc'' is sourced at the beginning of each interactive job. There is a similar file ''.bash_profile'' sourced at the beginning of each non-interactive job.  In our setup, we source ''.bashrc'' from ''.bash_profile'' so that the files are effectively the same, thus reducing the maintenance effort.  Interactive or batch is determined in ''.bashrc'' by ''[ -z "$PS1" ] && return'', which exits the file early on batch runs, so commands following it run for interactive sessions only, such as setting the value of the prompt ''$PS1''.  Commands towards the top of ''.bashrc'' run for both interactive and batch.  You can add commands to ''.bash_profile'' for batch jobs only.
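The effect of that guard can be demonstrated with a throwaway file (a minimal sketch; the variable names are invented for illustration):

```shell
# Build a tiny stand-in for .bashrc (hypothetical contents):
cat > /tmp/sketch_bashrc <<'EOF'
export SEEN_TOP=yes               # top of file: runs for batch AND interactive
[ -z "$PS1" ] && return           # batch shells have no $PS1, so they stop here
export SEEN_BOTTOM=yes            # below the guard: interactive sessions only
EOF

# Source it the way a batch (non-interactive) shell would, with PS1 unset:
( unset PS1; . /tmp/sketch_bashrc; echo "top=${SEEN_TOP:-no} bottom=${SEEN_BOTTOM:-no}" )
# prints: top=yes bottom=no
```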

For csh users, module commands should operate identically under ''tcsh'', but they are untested.
  
Here are some recommended module/.bashrc setups for different cases:
  
** You run only one program, or all the programs you run use the same modules, or each uses different modules that don't conflict **
  
Put the ''module load ...'' in ''.bashrc'', above the ''[ -z...'' line.  The same environment will be loaded from ''.bashrc'' for every interactive session, batch job, and MPI program if any.  Many modules, for instance ''R'' and ''matlab'' and ''python'', can be assumed not to conflict, though most of the very many combinations have not been tested.  Modules that definitely do conflict are MPI modules, of which only one may be safely used at a time, and multiple versions of the same program, like ''gcc/4.7.1'' and ''gcc/4.9.1''.  Multiple compilers, such as ''gcc'' and ''intel'', usually don't conflict, but for MVAPICH2 and OpenMPI the compiler module is used to load the MPI module, so multiple compiler modules should be loaded in this order: (1) the compiler module you want to use with MPI, (2) the MPI module, (3) any additional compiler module.  In some cases, gnu MPI programs using MKL libraries can want a library only available from the Intel compiler, so a combination like ''module load gcc/4.7.2 mkl/14.0.3 openmpi/1.8.8; module load intel/14.0.3'' may be necessary.  The last ''intel/14.0.3'' may show a harmless warning message.
  
** You use different (conflicting) modules for different programs, but only run single-node batch jobs **
  
If you put ''module load ...'' in your ''.bashrc'', every interactive session, batch job, and MPI thread will load it, which won't be good if a slave compute node loads the wrong MPI version. Delete or comment out (add a leading "#" to) the ''module load ...'' in ''.bashrc''.  Put the relevant ''module load ...'' in your batch scripts after the ''#PBS'' statements, for example:
  
<code>
#PBS ...
#PBS -l node=1:ppn=12
module purge
module load R openmpi
cd $PBS_O_WORKDIR
</code>
In this case, for interactive sessions such as compiling, type the relevant ''module load ...'' in your session before work.
  
** You use different (conflicting) modules for different programs, and also run multi-node MPI jobs **

This is a more difficult case.  The first two solutions won't always work.  If a module is set in a batch script using multiple nodes, the module definitely applies to the MPI processes running in the first or "master" compute node (usually the first and lowest numbered assigned node in our batch configuration) but does not necessarily apply to the "slave" compute nodes, depending on how different MPI types issue remote threads.  Multiple nodes imply MPI is being used, and the solution varies by MPI type: a certain form of the ''mpirun'' statement is required for each.  See the [[MPI|MPI]] article for more details.
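The underlying problem, that variables exported in the batch script do not automatically reach the shells spawned on slave nodes, can be simulated locally (a sketch only; ''env -i'', which clears the environment, stands in for the fresh shell a slave node would start):

```shell
# Set a variable the way a batch script would:
export MKL_NUM_THREADS=4

# A remote (slave-node) shell starts fresh and does not inherit it,
# which env -i imitates by wiping the environment:
env -i /bin/sh -c 'echo "slave sees MKL_NUM_THREADS=${MKL_NUM_THREADS:-unset}"'
# prints: slave sees MKL_NUM_THREADS=unset
```

This is why each MPI type provides its own ''mpirun'' option for forwarding environment variables to remote processes.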
modules.txt · Last modified: 2017/09/11 23:14 by wlfarris