To run an application, submit a batch job.

The cluster is designed to run computationally intensive jobs on compute nodes. Running resource-intensive jobs on the login nodes is not permitted. Instead, submit a batch script or request an interactive session through the job scheduler. Henry2 uses the LSF resource management software to schedule jobs.

A user supplies a list of resource requirements to LSF either in a text file, called a batch script, or on the command line. The number of cores and the time required must be specified. Other resource requirements may be added, e.g. memory, processor model, supported instruction sets, or GPU.

Sample batch scripts should be examined and modified as appropriate before submitting production jobs to LSF.

Step 1: Create LSF batch script

Basic batch scripts

Serial code

The following shows the contents of a text file called run_mycode.csh, a basic LSF batch script that runs a serial application, mycode.exe:

#!/bin/tcsh
#BSUB -n 1
#BSUB -W 120
#BSUB -J mycode
#BSUB -o stdout.%J
#BSUB -e stderr.%J
mycode.exe
The #BSUB directives take the place of specifying options to bsub via the command line. In this example, one core (-n 1) is being requested for a maximum of 120 minutes (-W 120). If the application were to run longer than 120 minutes, LSF would automatically terminate the job.

The -J directive sets the name that LSF displays for the job, which makes it possible to tell apart multiple jobs running at the same time.

The other two directives tell LSF where to put standard output and standard error messages from the job. The %J will be replaced by the job ID when the job begins. The files will be written to the directory the job is submitted from (the current working directory when the bsub command is invoked).
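
As a sketch of the equivalence between #BSUB directives and command-line options, the directives in run_mycode.csh can be written as options on a single bsub command line. The snippet is written in POSIX shell rather than the tcsh used by the batch scripts, which is an assumption about the shell in use. It prints the command and runs it only if bsub is found on the PATH.

```shell
#!/bin/sh
# Command-line equivalent of the #BSUB directives in run_mycode.csh.
# %J is left as-is for LSF to replace with the job ID.
submit_cmd="bsub -n 1 -W 120 -J mycode -o stdout.%J -e stderr.%J mycode.exe"

# Print the command for review.
echo "$submit_cmd"

# Run it only where LSF is installed (for example, on a login node).
if command -v bsub >/dev/null 2>&1; then
    $submit_cmd
fi
```

A batch script is usually the better choice for production jobs, since it keeps a record of the options that were used.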

Parallel code
-> Read this note before submitting parallel jobs.

The following shows the contents of a text file called run_my_parallel_code.csh, a basic LSF batch script that runs a parallel application, my_parallel_code.exe:

#!/bin/tcsh
#BSUB -n 4
#BSUB -W 120
#BSUB -q shared_memory
#BSUB -J mycode
#BSUB -o stdout.%J
#BSUB -e stderr.%J
my_parallel_code.exe
In this example, four processor cores (-n 4) are being requested. Specifying the shared_memory queue ensures that the 4 cores are on a single node. Another way to keep the cores on one node is to use #BSUB -R span[hosts=1]. Use span[hosts=1] if the run should go to the debug queue; otherwise use the shared_memory queue. Before leaving out this resource specification, verify that the application can run in distributed memory (across multiple nodes). For more information on this, see the FAQ.
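
The span[hosts=1] alternative can be sketched as follows. This POSIX shell snippet writes a variant of the parallel batch script that drops the shared_memory queue and requests a single host instead. The file name run_my_parallel_code_hosts1.csh is hypothetical, and no queue is named because the right choice depends on the run.

```shell
#!/bin/sh
# Write a variant of run_my_parallel_code.csh that keeps all 4 cores on
# one node with span[hosts=1] instead of the shared_memory queue.
# The output file name is hypothetical.
cat > run_my_parallel_code_hosts1.csh <<'EOF'
#!/bin/tcsh
#BSUB -n 4
#BSUB -W 120
#BSUB -R "span[hosts=1]"
#BSUB -J mycode
#BSUB -o stdout.%J
#BSUB -e stderr.%J
my_parallel_code.exe
EOF

# Show the directives that were written, for review before submitting.
grep '^#BSUB' run_my_parallel_code_hosts1.csh
```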

Advanced batch scripts

There are many ways to specify job parameters and resources, including memory requirements, processor model, supported instruction sets, and use of a GPU.

Step 2: Submit LSF job

Batch job

To submit the job to LSF for execution on a compute node, pass the batch script run_mycode.csh to the bsub command: bsub < run_mycode.csh.
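
When a job is submitted, bsub normally prints a confirmation line that contains the job ID in angle brackets. The POSIX shell sketch below extracts that ID so it can be reused with other LSF commands. The message format shown is an assumption based on typical LSF output and may differ on a given installation; extract_job_id is a hypothetical helper name.

```shell
#!/bin/sh
# extract_job_id: pull the numeric job ID out of a bsub confirmation
# message of the assumed form "Job <ID> is submitted to queue <NAME>."
extract_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Job <\([0-9][0-9]*\)>.*/\1/p'
}

# Sample message for illustration. On the cluster, capture the real one:
#   msg=$(bsub < run_mycode.csh)
msg='Job <948851> is submitted to queue <standard>.'

job_id=$(extract_job_id "$msg")
echo "Submitted job $job_id"
```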

Interactive job

To run interactive processes that require remote display (GUI), use an HPC-VCL node.

The HPC-VCL is reserved for applications that require interaction with a display, such as visualization software.

To test an application for the purposes of creating a working batch script, start a short interactive session on a compute node.

Production jobs should always be run via a batch script. It may be necessary to interact with an application at the command line to determine how to properly write the script. Users may not do this testing on a login node.

To interact with a compute node via the command line, submit a request to LSF using bsub -Is, followed by other LSF parameters such as the time and number of cores needed, and ending with tcsh. For example, to request an interactive session using 4 cores, all on the same node, with 10 minutes of wall clock time:
bsub -Is -n 4 -R "span[hosts=1]" -W 10 tcsh

Do not request more than one core for a serial job. For serial jobs, choose -n 1, and if the memory requirements are high, specify the memory required or request exclusive use of the node by using -x, e.g.,
bsub -Is -n 1 -x -W 10 tcsh

Interactive sessions must be kept to a minimum and used only when necessary. Long-running interactive sessions that leave nodes idle or underutilized may be terminated, as this violates the Acceptable Use Policy.

Step 3: Monitor LSF job

Job status

The bjobs command monitors the status of jobs after they are submitted to LSF. A job is usually in one of two states (STAT): PEND, meaning the job is queued and waiting for resources to become available, or RUN, meaning the job is currently executing. Typical bjobs output looks like:

[unityID@login01 ~]$ bjobs
JOBID   USER     STAT  QUEUE     FROM_HOST    EXEC_HOST  JOB_NAME  SUBMIT_TIME
948851  unityID  RUN   standard  login01.hpc  bc2j6      myjob1    Mar 25 16:15

The first column shows the job ID. To see detailed information about a job, use:
bjobs -l JOBID
For pending jobs, the output includes information about why the job is pending. It is possible to make resource requests that can never be satisfied; a job that has been pending for a long time may have made such a request. For guidance on interpreting the output of bjobs, see the LSF FAQ.
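
With many jobs in the system, a quick tally by state can be easier to read than the full listing. The POSIX shell sketch below counts jobs per STAT value, assuming the default bjobs column layout shown above, where STAT is the third column. The helper name count_by_state and the second sample job are hypothetical.

```shell
#!/bin/sh
# count_by_state: read bjobs-style output on stdin and print one line per
# STAT value with the number of jobs in that state. The header is skipped.
count_by_state() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

# Sample input. The first job is the one from the bjobs example; the
# second is a hypothetical pending job added for illustration.
sample='JOBID   USER     STAT  QUEUE     FROM_HOST    EXEC_HOST  JOB_NAME  SUBMIT_TIME
948851  unityID  RUN   standard  login01.hpc  bc2j6      myjob1    Mar 25 16:15
948852  unityID  PEND  standard  login01.hpc             myjob2    Mar 25 16:20'

printf '%s\n' "$sample" | count_by_state

# On the cluster, the same function can read live output:
#   bjobs | count_by_state
```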

Some LSF parameters may be modified for pending or running jobs using the bmod command. For example, to change the wall clock limit for a pending or running job, use bmod -W "[new time]" [job ID]. See the IBM LSF documentation for more details.
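
The batch scripts in this document give the wall clock limit either in minutes (-W 120) or as hours:minutes (-W 2:30). As a small sketch, the POSIX shell function below converts whole minutes to the hours:minutes form and prints a matching bmod command without running it. The helper name to_wall_clock is hypothetical, and the job ID is the one from the bjobs example.

```shell
#!/bin/sh
# to_wall_clock: convert a whole number of minutes to H:MM.
to_wall_clock() {
    printf '%d:%02d\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

job_id=948851                     # job ID taken from the bjobs example
new_limit=$(to_wall_clock 150)    # 150 minutes

# Print the command for review rather than running it.
echo "bmod -W $new_limit $job_id"
```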

A pending job may be removed from the queue or a running job may be terminated by using the bkill command. The job ID is used to specify which job to remove.
bkill JOBID
To terminate all of your running and pending jobs, use bkill 0.

Queue status

The bqueues command provides an overall view of LSF and some indication of how busy various queues may be. The following shows that in the long queue, there are a maximum of 200 job slots (or tasks) available. Of the 161 tasks submitted, 129 are running and 32 are pending.

[unityID@login04 ~]$ bqueues long
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP 
long             45  Open:Active     200   32    -    -   161    32   129     0
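
The same reading of the bqueues columns can be sketched in POSIX shell. The snippet assumes the default column order shown above (MAX in column 4, PEND in column 9, RUN in column 10); summarize_queue is a hypothetical helper name. A queue with no slot limit shows "-" under MAX, which this simple version would print as 0.

```shell
#!/bin/sh
# summarize_queue: read bqueues-style output on stdin and print a one-line
# summary per queue. Assumes the default column order:
#   $1 QUEUE_NAME  $4 MAX  $9 PEND  $10 RUN
summarize_queue() {
    awk 'NR > 1 { printf "%s: %d of %d slots running, %d pending\n", $1, $10, $4, $9 }'
}

# Sample input copied from the bqueues example.
sample='QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
long             45  Open:Active     200   32    -    -   161    32   129     0'

printf '%s\n' "$sample" | summarize_queue

# On the cluster, the same function can read live output:
#   bqueues long | summarize_queue
```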

Cluster status

The cluster status page provides information about available nodes and can help inform resource requests.

Sample MPI script

#!/bin/tcsh
#BSUB -n 32
#BSUB -J test1_hydro
#BSUB -W 2:30
#BSUB -R "select[avx2]"
#BSUB -o test1_hydro.out.%J
#BSUB -e test1_hydro.err.%J
#BSUB -q standard_ib
module load PrgEnv-intel
mpirun ./hydro.exe

This will run an MPI code called hydro.exe. It will launch 32 MPI tasks and is not expected to run longer than 150 minutes (-W 2:30, i.e. 2 hours 30 minutes). To ensure all nodes are of the same type (standard) and the network used is InfiniBand (ib), the standard_ib queue is specified. The code was compiled with Intel MPI, and the runtime environment must be set to match the compile environment. It was also compiled with optimizations for the AVX2 instruction set, so nodes that support AVX2 are requested with select[avx2]; the code may not run on older nodes.

Sample hybrid MPI-OpenMP script

#!/bin/tcsh
#BSUB -n 6                      # Number of MPI tasks
#BSUB -R span[ptile=2]          # MPI processes per node
#BSUB -x                        # Exclusive use of nodes
#BSUB -J chemtest1              # Name of job
#BSUB -W 2:30                   # Wall clock time
#BSUB -o chemtest1.out.%J       # Standard out
#BSUB -e chemtest1.err.%J       # Standard error
#BSUB -q standard_ib            # Queue
module load openmpi-gcc         # Set environment
setenv OMP_NUM_THREADS 4
mpirun ./chemtest1.exe

This will run a hybrid MPI-OpenMP code called chemtest1.exe. It will launch 6 MPI tasks and is not expected to run longer than 150 minutes. Two tasks (span[ptile=2]) will be placed on each of 3 nodes. Four threads will be spawned for each MPI task, so each node will need to have a minimum of 8 cores. LSF schedules by the number of tasks requested and does not account for the additional threads, so -x must be used to prevent contention with other jobs or overloading the node. To ensure all nodes are of the same type (standard) and the network used is InfiniBand (ib), the standard_ib queue is specified. The code was compiled with GNU OpenMPI, and the runtime environment must be set to match the compile environment.
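
The node and core arithmetic for this job can be checked with a few lines of POSIX shell. The variable names are illustrative; the values come from the script above.

```shell
#!/bin/sh
tasks=6      # MPI tasks, from #BSUB -n 6
ptile=2      # MPI tasks per node, from span[ptile=2]
threads=4    # OpenMP threads per task, from OMP_NUM_THREADS

# Nodes needed: tasks divided by tasks per node, rounded up.
nodes=$(( (tasks + ptile - 1) / ptile ))

# Cores each node must provide: tasks per node times threads per task.
cores_per_node=$(( ptile * threads ))

# Cores in use across the whole job.
total_cores=$(( tasks * threads ))

echo "nodes=$nodes cores_per_node=$cores_per_node total_cores=$total_cores"
```

The same arithmetic applies when changing -n, ptile, or OMP_NUM_THREADS: the product of ptile and the thread count should not exceed the number of cores on the requested node type.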

Sample CUDA script

#!/bin/tcsh
#BSUB -n 1
#BSUB -W 30
#BSUB -q gpu
#BSUB -gpu "num=1:mode=exclusive_process:mps=yes"
#BSUB -o out.%J
#BSUB -e err.%J
module load PrgEnv-pgi
module load cuda
./nnetworks.exe

This will run a CUDA code called nnetworks.exe. It will use one core and is not expected to run longer than 30 minutes. Using the gpu queue will place the job on a GPU node, and the -gpu specifier will allow the use of one GPU on the node (num=1). The code was compiled using PGI and the CUDA libraries, and the runtime environment must be set to match the compile environment.

Copyright © 2019 · Office of Information Technology · NC State University · Raleigh, NC 27695