LSF resource strings

Resource strings are used to ensure that a job will be placed on nodes that have the resources the job requires. There are resources that are native to LSF, such as the number of cores or memory needed, and there are resources that have been defined locally for the Henry2 cluster, such as model name or node type.

Resource strings may contain a number of sections:

  • span section that specifies how tasks should be distributed across nodes
  • select section that specifies criteria for selecting eligible nodes to run the job
  • rusage section that specifies resource consumption anticipated for the job
  • order section that specifies criteria for sorting eligible nodes
  • same section that specifies node characteristics that must be the same for all nodes assigned to a parallel job
  • cu section that specifies topological requirements for a parallel job
  • affinity section that specifies CPU and memory binding requirements
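
As a minimal sketch of how sections combine into one resource string, the snippet below joins whichever sections are non-empty and prints the resulting bsub command instead of submitting it. The variable names, the task count, and the wall time are illustrative assumptions, and job_script_name is a placeholder.

```shell
# Sketch: join the non-empty sections of a resource string with spaces.
# The section contents are illustrative values, not site recommendations.
select_sec='select[avx2]'     # criteria for eligible nodes
span_sec='span[ptile=8]'      # task distribution across nodes
rusage_sec=''                 # left empty: no rusage section for this job

res_req=''
for sec in "$select_sec" "$span_sec" "$rusage_sec"; do
    if [ -n "$sec" ]; then
        # Add a separating space only when res_req already has content.
        res_req="${res_req:+$res_req }$sec"
    fi
done

# Dry run: print the command rather than submitting the job.
echo "bsub -n 16 -W 60 -R \"$res_req\" < job_script_name"
```

Printing the command first makes it easy to inspect the quoting before a real submission.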

Requesting a specific queue

To specify a queue, use -q queue_name.

If no queue is specified, LSF chooses the most appropriate queue from the set of default queues, based on the number of cores and the time requested. See the FAQ on choosing a queue for additional information.

Default queues:

  • debug
  • serial
  • serial_long
  • standard
  • short
  • long
  • single_chassis

There are several other queues with access to special resources.

Specialty queues:

  • gpu: nodes that have attached Nvidia GPUs
  • standard_ib: InfiniBand network, homogeneous jobs (all nodes must be the same type)
  • mixed_ib: InfiniBand network, heterogeneous jobs (nodes chosen may be of different types)

The queues available to a user can be displayed by using bqueues -u user_name.
The properties of a queue can be displayed by using bqueues -l queue_name.

Job priority is determined by several factors, including fair share priority, queue priority, and time of submission.

  • See further details on queue priority.

    Define task distribution using span

    Shared memory jobs must be placed on a single node, or host. Some memory-intensive MPI jobs and hybrid parallel jobs must limit the number of tasks per node.

    To specify a span type, use -R "span[span_type=#number]".

    Span type    Description
    hosts        Maximum number of hosts to confine tasks
    ptile        Maximum number of tasks per host
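
The two span types can be sketched as follows. The task counts and wall times are illustrative assumptions, job_script_name is a placeholder, and the commands are printed rather than submitted.

```shell
# Shared memory job: keep all 8 tasks on a single host.
shared_mem_req='span[hosts=1]'
echo "bsub -n 8 -W 60 -R \"$shared_mem_req\" < job_script_name"

# Hybrid or memory-intensive MPI job: at most 4 tasks per host.
tasks=32
ptile=4
hybrid_req="span[ptile=$ptile]"

# With at most $ptile tasks per host, the job needs at least
# ceil(tasks / ptile) hosts.
min_hosts=$(( (tasks + ptile - 1) / ptile ))
echo "bsub -n $tasks -W 60 -R \"$hybrid_req\" < job_script_name"
echo "minimum hosts needed: $min_hosts"
```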

    Specify consumption using rusage

    Use rusage to reserve the resources a job is expected to consume, for example when it needs more memory than usual. Usage is specified per host. See the generic LSF template for specific examples.

    To specify a usage type, use -R "rusage[usage_type=#number]".

    Usage type    Description
    mem           Memory requirements
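
A minimal sketch of a memory reservation follows. The value 16000 is a placeholder: this page does not state the units LSF expects for mem on Henry2, so check the generic LSF template before reusing the number. The command is printed rather than submitted.

```shell
# Placeholder value; the unit depends on the cluster's LSF configuration.
mem_per_host=16000

# Usage is per host, so this reserves mem_per_host on each host the job uses.
rusage_req="rusage[mem=$mem_per_host]"

echo "bsub -n 1 -W 60 -R \"$rusage_req\" < job_script_name"
```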

    Define resources using select

    To specify a resource, use -R "select[Resource_type]".

    LSF does not report an error if the requested combination of resources does not exist; the job simply pends indefinitely. For example, -R "select[hc model==Gold6130]" can never be satisfied, because hc requests processors with six cores while the Gold6130 model processors have 16 cores.

    The following is a list of the types of resources available on Henry2 and a description of each.

    Resource by processor type

    The number of cores per node is requested by specifying the number of cores per processor. Each node has two Intel Xeon processors of the same model, so a specification of hc selects a node with two six-core processors, i.e., a node with 12 cores.

    Resource type    Description
    sc               Processor model with single core
    dc               Processor model with two (dual) cores
    qc               Processor model with four (quad) cores
    hc               Processor model with six cores
    oc               Processor model with eight cores
    tc               Processor model with ten cores
    twc              Processor model with twelve cores
    stc              Processor model with sixteen cores

    Resource by instruction set architecture

    Software compiled on one type of architecture may not run on another, resulting in an "illegal instruction" error. LSF resources may be used to specify the instruction set architecture (ISA).

    Resource type    Description
    sse              Processor model with SSE instructions
    sse2             Processor model with SSE2 instructions
    ssse3            Processor model with SSSE3 instructions
    sse4_1           Processor model with SSE4.1 instructions
    sse4_2           Processor model with SSE4.2 instructions
    avx              Processor model with AVX instructions
    avx2             Processor model with AVX2 instructions
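
As a hedged sketch, the snippet below checks whether the current host reports a given ISA and then builds the matching select section. It assumes a Linux host, where the flag names in /proc/cpuinfo happen to match the resource names in the table. The choice of avx2, the task count, and the wall time are illustrative, and the bsub command is printed rather than submitted.

```shell
isa=avx2    # ISA the software was compiled for (illustrative choice)

# -w matches whole words only, so "sse" does not match "sse2" or "ssse3".
if grep -qw "$isa" /proc/cpuinfo 2>/dev/null; then
    echo "this host reports $isa"
else
    echo "this host does not report $isa"
fi

# Restrict the job to nodes that provide the same ISA.
res_req="select[$isa]"
echo "bsub -n 1 -W 30 -R \"$res_req\" < job_script_name"
```

Running the check on the machine where the software was compiled gives a quick indication of which resource to request.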

    Resource by GPU model

    As with the ISA compatibility issues described above, a given software package may not be compatible with all GPU models.

    Resource type    Description
    rtx2080          Node with attached Nvidia RTX 2080 GPU
    gtx1080          Node with attached Nvidia GTX 1080 GPU
    p100             Node with attached Nvidia P100 GPU
    k20m             Node with attached Nvidia K20m GPU
    m2070            Node with attached Nvidia M2070 GPU
    m2070q           Node with attached Nvidia M2070Q GPU
    m2090            Node with attached Nvidia M2090 GPU

    Resource by interconnect

    The type of interconnect may be specified.

    LSF does not report an error if a job is submitted to a queue that has no nodes with the specified interconnect. For example, a job that requests ib must be submitted to a queue containing nodes with InfiniBand. Queues available to all users that have ib include standard_ib and mixed_ib.

    Resource type    Description
    ib               InfiniBand
    e10G             10G Ethernet
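
Because LSF stays silent about a mismatch between queue and interconnect, a small guard before submission can help. The sketch below knows only the two InfiniBand queues named on this page; other queues may also contain ib nodes. The function name is hypothetical, and the bsub command is printed rather than submitted.

```shell
# Return 0 if the queue is one of the InfiniBand queues named on this page.
# Hypothetical helper: it does not query LSF for the real queue contents.
is_known_ib_queue() {
    case "$1" in
        standard_ib|mixed_ib) return 0 ;;
        *) return 1 ;;
    esac
}

queue=standard_ib
if is_known_ib_queue "$queue"; then
    echo "bsub -q $queue -n 32 -W 60 -R \"select[ib]\" < job_script_name"
else
    echo "queue $queue is not a known InfiniBand queue; a select[ib] job may pend" >&2
fi
```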

    Henry2 node models

    The following model definitions are used for Henry2 nodes. They correspond to the Intel Xeon model numbers of the processors on the nodes. Each node has two processors of the same model. Here is a site with filter and search capabilities that lists processor model specifications.

    To request a specific processor model, use -R "select[model==model_number]".

    The following is a list of the model numbers currently available on Henry2.

    X5130       L5335       E5335       E5405       E5504
    E5520       L5535       E5540       E5620       L5640
    E5645       X5650       HE8374      E52640      E52640v2
    E52650      E52650L     E52650v2    E52650v3    E52650v4
    E52690      Gold6130    Silver4108
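
Since a misspelled model name would leave a job pending without any error, it can be worth checking the name against the list above before building the select section. This is a minimal sketch: the function name is hypothetical, the list is copied from this page and may become outdated, and the bsub command is printed rather than submitted.

```shell
# Model names copied from the list above; may not reflect later hardware changes.
known_models="X5130 L5335 E5335 E5405 E5504 E5520 L5535 E5540 E5620 L5640 \
E5645 X5650 HE8374 E52640 E52640v2 E52650 E52650L E52650v2 E52650v3 E52650v4 \
E52690 Gold6130 Silver4108"

# Print a select section for a known model; fail for anything else.
# Surrounding spaces force an exact match (E52650 does not match E52650L).
model_select() {
    case " $known_models " in
        *" $1 "*) echo "select[model==$1]" ;;
        *) echo "unknown model: $1" >&2; return 1 ;;
    esac
}

if res_req=$(model_select Gold6130); then
    echo "bsub -n 32 -W 60 -R \"$res_req\" < job_script_name"
fi
```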

    Examples

    Run a job with 48 tasks (-n 48) on four nodes with 12 cores per node:
        bsub -n 48 -W 120 -R "select[hc] span[ptile=12]" < job_script_name
    or, using special select syntax:
        bsub -n 48 -W 120 -R "hc span[ptile=12]" < job_script_name
    Nodes have two processors, and the resource name defined for nodes with 6-core processors is hc. This job would fully occupy 4 nodes.


    Run a job with 50 tasks with the tasks distributed 10 per node:
        bsub -n 50 -W 200 -R "span[ptile=10]" < job_script_name
    This resource string does not specify anything about the node selection criteria beyond needing 10 cores on each node. If the job were scheduled on nodes with 12 cores per node, it is possible that LSF would schedule other jobs on the nodes being used for this job to occupy the remaining cores. In general, it is desirable to fully utilize nodes to avoid potential contention from other jobs.
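
The node counts in the two examples above follow from simple arithmetic, sketched here in shell. The function name is hypothetical, and the calculation assumes every node used has the same number of cores.

```shell
# layout TASKS PTILE CORES_PER_NODE
# Prints how many nodes a job needs and how many cores it leaves idle.
layout() {
    tasks=$1
    ptile=$2
    cores=$3
    nodes=$(( (tasks + ptile - 1) / ptile ))   # ceiling of tasks / ptile
    idle=$(( cores - ptile ))                  # cores left free on each node
    echo "$nodes nodes, $idle idle cores per node, $(( nodes * idle )) idle cores total"
}

layout 48 12 12   # prints "4 nodes, 0 idle cores per node, 0 idle cores total"
layout 50 10 12   # prints "5 nodes, 2 idle cores per node, 10 idle cores total"
```

The second call shows why the 50-task job may share its nodes: on 12-core nodes, two cores per node remain available to other jobs.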

    More examples
    See the generic template for creating a detailed batch script for more information and examples about how to specify LSF resources.

    External links for more information on hardware

    General hardware

  • Understanding performance and obtaining hardware information, a webinar by SDSC (San Diego Supercomputer Center), December 2018. This webinar discusses how to find CPU specs, memory quantity, cache configuration, file systems, OS version, GPU properties, and instruction sets (ISA). It also discusses why one should understand them, and it includes a short discussion of profiling, monitoring, and optimizing. Original link to all SDSC webinars is here.
  • Here is a subset of the commands reviewed in the above webinar:
    uname -a                    # OS and kernel info
    cat /etc/centos-release     # Linux distribution
    lscpu                       # CPU info
    cat /proc/cpuinfo           # processor info
    cat /proc/meminfo           # memory info
    nvidia-smi                  # GPU info
    
Copyright © 2020 · Office of Information Technology · NC State University · Raleigh, NC 27695