LSF provides resource management and job scheduling for the Henry2 cluster. Every job must specify two resource requirements: the number of processor cores the job will use and the maximum wall clock time the job will run. Additional resource requirements may be added, such as the memory required, the processor model needed, the processor instruction set support required, and whether a GPU is required.

A detailed LSF job script template is shown below:

#!/bin/tcsh 
#BSUB -n X
#BSUB -W HH:MM
#BSUB -q queue_name
#BSUB -R span[span_type=Y]
#BSUB -R select[resource_type]
#BSUB -R rusage[usage_type]
#BSUB -gpu "[string_of_specifiers]"
#BSUB -m "hostname or host group"
#BSUB -o out.%J
#BSUB -e err.%J
#BSUB -J jobname
#BSUB -x
module load program_environment
mpirun program
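
Save the script to a file (for example, submit.csh, a name used here only for illustration) and submit it to LSF by redirecting the file into bsub:

bsub < submit.csh

Each directive in the template is explained below.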

Shell

The first line specifies the shell that will be used to interpret the rest of the script. The default shell for most users is tcsh; bash is also available. To find the current shell, type: echo $SHELL.

Example:
#!/bin/tcsh
back to script

-n Cores or Tasks

Usually this is the number of cores needed for the job. In special cases such as hybrid MPI/OpenMP or Rmpi jobs, this means the number of MPI tasks.

Example:
#BSUB -n 32
mpirun program
This requests 32 cores and tells mpirun that there are 32 tasks, which could also be specified explicitly as mpirun -n 32. Repeating -n 32 after mpirun is unnecessary when it matches the value given to #BSUB -n.
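
For hybrid MPI/OpenMP jobs, -n counts MPI tasks rather than cores, and each task needs additional room for its threads. A minimal sketch, assuming a hypothetical executable hybrid_program and 4 OpenMP threads per task (the ptile and -x options are described in later sections):
#BSUB -n 4
#BSUB -W 4:00
#BSUB -R span[ptile=1]
#BSUB -x
setenv OMP_NUM_THREADS 4
mpirun ./hybrid_program
Here each of the 4 MPI tasks is placed on its own node and spawns 4 threads, so the nodes are requested exclusively to keep the extra threads from competing with other jobs.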
back to script

-W Wall Clock Time

Set the runtime limit of the job.

Example:
#BSUB -W 2:15 
This will set the maximum run time to two hours and 15 minutes. If the job takes longer, it will be killed. Increasing -W may prevent a job from being killed prematurely, but it may also result in a longer wait in the queue.
back to script

-q Queue

Choose which queue to submit the job. If not specified, LSF will choose a default queue. Let LSF choose the proper queue unless there is a specific reason not to. For more information on selecting a queue, see the FAQ.

Example:
#BSUB -q gpu 

This will submit to the gpu queue. Specifying the gpu queue will ensure the node has an attached GPU.

See LSF Resources for more information on the available queues.
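
The queues defined on the cluster can be listed with the standard LSF command bqueues, and bqueues -l queue_name shows the detailed limits of a particular queue:
[unityID@login01]$ bqueues
[unityID@login01]$ bqueues -l gpu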
back to script

-R Resource

Specify type of compute resource or details about compute configuration. Examples of different resource types will be given below.

See LSF Resources for more information on the available resources.

-R Resource - span

ptile

Specify how many cores/tasks to place on each node:
#BSUB -R span[ptile=#numcores]

Example:
#BSUB -n 16
#BSUB -R span[ptile=4]
#BSUB -x 
This requests 16 cores, with 4 cores placed on each node, so four nodes will be allotted. Common reasons for using fewer cores than a node provides are that the job spawns additional threads or is memory intensive. In either case, also specify -x, which ensures that no other jobs are scheduled on those nodes.

hosts

Specify how many nodes (hosts) to confine the cores/tasks to:
#BSUB -R span[hosts=#numhosts]

Example:
#BSUB -n 8 
#BSUB -R span[hosts=1]
This requests 8 cores and requires that all of them be placed on the same node. This is used for pure shared-memory jobs, i.e., when the program cannot communicate between nodes.
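
For example, a pure OpenMP job could be submitted with a script like the following sketch (the executable name omp_program is a placeholder; setenv is the tcsh syntax for setting an environment variable):
#BSUB -n 8
#BSUB -W 1:00
#BSUB -R span[hosts=1]
setenv OMP_NUM_THREADS 8
./omp_program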
back to script

-R Resource - select

Select a particular resource, including node type.
#BSUB -R select[resource_type]

Example:
#BSUB -R select[m2070]
This requests a node with an Nvidia M2070 model GPU.

Example:
#BSUB -R select[hc || oc ] 
This requests a node that has either 12 cores or 16 cores.

See LSF Resources for more information on specific resources.
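
Multiple resource requirements can also be combined. For example, a sketch that fills a single 16-core (oc) node completely might look like this (the program name is a placeholder):
#BSUB -n 16
#BSUB -W 2:00
#BSUB -R select[oc]
#BSUB -R span[hosts=1]
./program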

back to script

-R Resource - rusage

Set a usage case, such as higher memory. Usage requests are per host.

Example:
#BSUB -n 16
#BSUB -R span[hosts=1]
#BSUB -R "rusage[mem=16000]"
This requests a single node (span[hosts=1]) with at least 16 cores (-n 16) and at least 16 GB of RAM (rusage[mem=16000], specified in MB).

See LSF Resources for more information on usage.

back to script

-gpu GPU specifiers

NVIDIA's Multi-Process Service (MPS) may improve GPU performance for certain types of applications, but MPS is only available on newer GPU models.

To run on any of the available GPUs, the mps setting must be set to no, and the gpu queue must also be specified.

#BSUB -q gpu
#BSUB -gpu "num=1:mode=shared:mps=no"

For older GPUs (M2070, M2070Q, M2090) that do not support NVIDIA's mps, use:
#BSUB -R "select[m2070 || m2070q || m2090]"
#BSUB -q gpu
#BSUB -gpu "num=1:mode=shared:mps=no"

For newer GPUs (RTX 2080, GTX 1080, P100, K20m) that support NVIDIA's mps, use:

#BSUB -R "select[rtx2080 || gtx1080 || p100 || k20m]"
#BSUB -q gpu
#BSUB -gpu "num=1:mode=shared:mps=yes"
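
A complete GPU job script built from these pieces might look like the following sketch (the CUDA module name and the executable gpu_program are placeholders; check module avail for the modules actually installed):
#!/bin/tcsh
#BSUB -n 1
#BSUB -W 2:00
#BSUB -q gpu
#BSUB -gpu "num=1:mode=shared:mps=no"
#BSUB -o out.%J
#BSUB -e err.%J
module load cuda
./gpu_program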

back to script

-m Hostname

Select a particular host or host group to run on. This may be useful in very specific use cases such as for consistency when doing scaling tests; otherwise, it should be avoided.

Example:
#BSUB -n 16 
#BSUB -m "blade2a1" 
This requests that the job be run on blade2a1, which in this case is a host group name. The hosts it contains may be listed using bmgroup:
[unityID@login01]$ bmgroup blade2a1
GROUP_NAME    HOSTS                     
blade2a1     n2a1-1 n2a1-2 n2a1-3 n2a1-4 n2a1-5 n2a1-6 n2a1-7 n2a1-10 n2a1-8 n2a1-11 n2a1-9 n2a1-12 n2a1-13 n2a1-14  
back to script

-o Output file

Name of the file containing the standard output from the run. %J will attach the job ID to the file. This is optional.

Use -oo to write to an output file, overwriting any existing file.

Example:
#BSUB -oo output_file
This will write output to a file called output_file, and it will overwrite existing files with the same name.

If -o is used instead of -oo, the output will be appended to the existing file.

Example:
#BSUB -o output_file.%J
This will write standard output to a file called output_file.%J, where %J is the LSF job ID.
back to script

-e Error file

Name of the file containing the standard error from the run. %J will attach the job ID to the file. This is optional. Use -eo to write to an error file, overwriting any existing file.

Example:
#BSUB -e error_file.%J
This will write standard error to a file called error_file.%J, where %J is the LSF job ID.
back to script

-J Jobname

Assigns a name to the job.

Example:
#BSUB -J program_T=200K 
This sets the job name in order to distinguish a job in the queue. It will appear as JOB_NAME when monitoring the job using bjobs.
JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
884547  unityID RUN   gpu        login02.hpc n3h39       run_T=200K Oct 31 11:50
904168  unityID RUN   gpu        login02.hpc 8*n3h39     run_T=250K Nov  4 10:02
564888  unityID PEND  gpu        login04.hpc             run_T=300K Oct 10 13:20
back to script

-x Exclusive

Reserves exclusive use of every node assigned to the job; no other jobs will be scheduled on those nodes.

Example:
#BSUB -n 2 
#BSUB -R span[ptile=1] 
#BSUB -x
mpirun program
This requests 2 MPI tasks and specifies that each task be placed on a separate node. This is usually done when the memory per task is high, or when the program launches additional tasks or threads that need more than the 2 cores the scheduler would otherwise assign to the job. Exclusive use of the nodes (-x) must be specified to keep other jobs off the remaining cores and to avoid competing for the memory available on each node. The exclusive option is not available for all queues.
back to script

Set the environment

More information on setting the compute environment is available in the compute environment documentation.

Add the necessary executables and libraries to the run environment.

Example:
module load R/gcc_4.8.5_R-3.5.1
module load mpi/gcc_openmpi
This adds R and the OpenMPI libraries to the path.
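
To check which modules are available and which are currently loaded, use the standard module commands:
module avail
module list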
back to script

Call the program

Run the program.

Example:
#BSUB -n 2
module load mpi/gcc_openmpi
mpirun ./a.out 
This is equivalent to running mpirun -n 2 ./a.out.

Example:
#BSUB -n 1
module load python2 
python myscript.py
This loads the Python environment and runs a Python script.
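
Putting the pieces together, a complete script for this Python example might look like the following sketch (the job name, run time, and file names are placeholders):
#!/bin/tcsh
#BSUB -n 1
#BSUB -W 1:30
#BSUB -J myscript
#BSUB -o out.%J
#BSUB -e err.%J
module load python2
python myscript.py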
back to script