High Performance Computing

Getting Started with henry2 Linux Cluster

Henry2 System Configuration

There are 1219 dual-Xeon compute nodes in the henry2 cluster. Each node has two Xeon processors (a mix of dual-, quad-, six-, eight-, ten-, and twelve-core) and 2 to 6 gigabytes of memory per core. The total number of cores increases as additional nodes are purchased and now exceeds 10,000.

The nodes all have 64-bit processors. Generally, 64-bit x86 executables will run correctly. Some 32-bit executables may run, but they are no longer supported. 64-bit executables are required in order to access more than about 3 GB of memory for program data.
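
To check whether an existing program is a 64-bit build, the standard Linux file utility can be used (a general command, not specific to henry2; ./myprogram is just a placeholder name):

    file ./myprogram
    # a 64-bit build is reported as something like "ELF 64-bit LSB executable, x86-64 ..."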

The compute nodes are managed by the LSF resource manager and may only be accessed through LSF (accounts that access compute nodes directly are subject to immediate termination).

Logins to the cluster are handled by a set of login nodes, which can be accessed as login.hpc.ncsu.edu using ssh.

Additional information on the initial henry2 configuration (c. 2004) is available at http://hpc.ncsu.edu/Documents/hpc_cluster_config.pdf.
Some additional information about the cluster architecture is available at http://hpc.ncsu.edu/Hardware/henry2_architecture.php.

Logging onto henry2 cluster

Normal login
    SSH access is supported to the login nodes (login.hpc.ncsu.edu). These are a set of nodes that use DNS round-robin load balancing. Logins are authenticated using Unity user names and passwords. Microsoft Windows users can use X-Win32 to log onto login.hpc.ncsu.edu. To obtain X-Win32, go to the ITECS Software page, click on the Downloads button near the bottom, log in, and then download and install X-Win32.

    If you are a Mac (or Linux) user, then in the Terminal you can use the command

    ssh -l yourUsername login.hpc.ncsu.edu

    to log onto HPC login nodes.
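
    If you need to display graphical programs back on your local machine, X11 forwarding can be enabled with the standard OpenSSH -X option (a general ssh feature, not specific to henry2); heavy graphics are better handled through the VCL Remote Desktop option described below:

    ssh -X -l yourUsername login.hpc.ncsu.edu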

    Login nodes should not be used for interactive jobs that consume any significant amount of system resources. The usual way to run CPU-intensive codes is to submit them as batch jobs to LSF, which schedules them for execution on compute nodes. Example LSF job submission files can be found in Intel Compilers. See LSF 9.1.2 for more complete documentation.

Alternative login nodes (VCL)

    It is sometimes necessary to use interactive, GUI-based serial pre- and post-processors on data resident in the HPC environment. Interactive computing in the HPC environment should be performed by requesting a Virtual Computing Lab (VCL) HPC environment.

    To use the VCL HPC environment you need to send e-mail to

    oit_hpc@help.ncsu.edu

    to request to be added to the "vcl" group. After you have been added to the vcl group, go to the web page http://vcl.ncsu.edu and click on "Make a Reservation". If you have not already authenticated with your Unity ID and password, you will be prompted to do so.

    From the list of environments, select "HPC (CentOS 7.2 64 bit VM)".

    (You will not see the "HPC (CentOS 7.2 64 bit VM)" entry if you have not been added to the "vcl" group.)

    When the environment is ready, VCL will provide information on how to log in. VCL provides a dedicated environment, so heavy interactive use will not interfere with other users. If you have problems using the VCL HPC environment, send e-mail to: oit_hpc@help.ncsu.edu.

    For more information about VCL HPC, please go to Running Interactive jobs with the HPC VCL image.

(New) HPC Linux Desktop environment via Remote Desktop connection to HPC-VCL
  • Linux Desktop Environment on the HPC
  •   Do you need to do any of the following?
      Use a desktop environment on the HPC?
      Click and drag folders/files from your computer to the HPC file system?
      Click and drag folders/files from the Andrew File System (AFS) to the HPC file system?
      Run a web browser on the HPC - to click and download files from web servers to the HPC?
      Run a web browser to access the iRODS file system - to upload or download files between the iRODS file system and the HPC file system?
      This is now possible with a Remote Desktop program used to access the HPC login node in the VCL.
      To try it out: if you have already used the HPC login node in the VCL (HPC-VCL), the instructions below will only take a few minutes - please follow them and try it out. First-time users of the HPC-VCL must first request access to that image: send a request to oit_hpc@help.ncsu.edu with "access HPC-VCL login node" in the subject line. When access is granted, go to vcl.ncsu.edu
    • Click on "Reservations"
    • Click the "New Reservation" button
    • Under the "Please select the environment you want to use from the list:" dropdown list, select "HPC (CentOS 7.2 64 bit VM)"
      • Tip: If you type "hpc" followed by a space in the dropdown box, the list will be filtered to only show the HPC environment
    • Select how long you would like to reserve the environment next to the "Duration" dropdown box
    • Click "Create Reservation"
    • The reservation is usually ready in less than 5 minutes; you will see a "Connect" button when it is ready, and an email will notify you as soon as it is ready.
    • Click "Connect" and follow the instructions you see
      Clicking the "Connect" button will yield a pop-up window, showing two options for logging in.
    • Connect to reservation using SSH
      • You are probably already familiar with this option. Use an SSH client such as the ssh command on Linux or the PuTTY application on Windows to interact with the remote VCL computer in a text-based terminal.
    • Connect to reservation using xRDP for Linux NEW!
      • This is the new, graphical connection option.  Near the bottom of the Connect dialog box, click "Get RDP file" to download a file that contains the connection information for your current reservation.
      • On most systems, you can simply execute the downloaded .rdp file and it will automatically launch your computer's remote desktop program and connect to the remote VCL computer.
      • This process has been validated for Windows machines. On Linux and Mac machines, using a Remote Desktop app with the username and password should work. If Remote Desktop did not come with your Mac OS X system, you can download the free Microsoft Remote Desktop app from the App Store. For Mac machines, in the Remote Desktop app, go to Preferences > Security tab and select "Always connect, even if authentication fails". For Linux, you may want to use xfreerdp.
      • Once logged into the remote VCL computer, the screen may appear black for about 35 seconds. Then you will see a Linux desktop environment.
      • A command-line terminal can be opened with Applications > Terminal
      • The user's HPC home directory can be browsed by clicking on the Home icon
      • HPC file systems such as gpfs_share, as well as AFS file space including the user's AFS home directory, can be browsed by:
        • Clicking on the Home icon
        • Clicking on "Other Locations" at the bottom of the left column
        • Opening "Computer" in the main pane
      • Files on the local computer from which the user is connecting may be accessed on the remote VCL computer by clicking on the thinclient_drives icon on the desktop.

        This will open a window containing folders that correspond to the drives on the user's local computer. For example, the C: folder corresponds to the C: drive on a Windows computer used to connect via Remote Desktop. (Note: performance may be slow when browsing or copying files via thinclient_drives. For larger file transfers, the user will be better off using an SCP or SFTP client.)
      • Do you need to use the iRODS data grid file system? The HPC-VCL is also an iRODS client, and the iRODS file system can be accessed from it: https://datascience.oit.ncsu.edu/faq/#q1
    Once you have an initialized iRODS account, you can start a Firefox web browser on the HPC-VCL and go to: irods-cloudbrowser.hpc.ncsu.edu

    File Systems

    AFS files are not available from the cluster (but are available on the VCL HPC environments described above).
    Home Directory

      Users have a home directory that is shared by all the cluster nodes. The /usr/local file system is also shared by all nodes. The home file system is backed up daily, with one copy of each file retained.

      The home file system quota is intentionally small to ensure that the entire file system can be restored quickly if necessary, since cluster operation depends on the presence of the home file system. Large files and datasets should be stored on other file systems.

    Scratch File Systems
      A shared scratch file system, /share, is available to all users. This file system is not backed up, and files may be deleted from it automatically at any time; use of this file system is at the user's own risk. There is a 10 TB group quota on /share. To find the group directory in which the user bfoo has permission to read and write, type the following on a login blade:

      grep bfoo /etc/group
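
      The output lists the group(s) the user belongs to. As a sketch (the group and directory names below are hypothetical placeholders, and the assumption that the /share directory matches the group name should be checked against what actually exists):

      # hypothetical output of: grep bfoo /etc/group
      #   grpfoo:x:12345:bfoo,otheruser
      # the corresponding group directory would then typically be /share/grpfoo
      ls /share/grpfoo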
      

      A parallel file system, /gpfs_share, is also available. Directories on /gpfs_share can be requested. There is a 1 TB group quota imposed on /gpfs_share. The /gpfs_share file system is not backed up, and files are subject to deletion at any time. Use is at the user's own risk.

    Mass Storage

      Finally, from the login nodes the HPC mass storage file systems, /ncsu/volume1 and /ncsu/volume2, are available for storage in excess of what can be accommodated in /home. Since these file systems are not available from the compute nodes, they cannot be used for running jobs.
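
      Data kept in mass storage therefore needs to be copied to a file system visible to the compute nodes (such as /share or /gpfs_share) before a job runs. A minimal sketch, run on a login node with placeholder paths:

      cp /ncsu/volume1/myproject/input.dat /share/mygroup/myuserid/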

    User files in /home, /ncsu/volume1, and /ncsu/volume2 are backed up daily. A single backup version is maintained for each file. User files in all other file systems are not backed up.

    Important files should never be placed on storage that is not backed up unless another copy of the file exists in another location.

    HPC projects are allocated 1TB of storage in one of the HPC mass storage systems (volume1 or volume2). Additional backed up space in these file systems can be purchased or leased.

    Additional information about storage on HPC resources is available from http://hpc.ncsu.edu/Documents/GettingStartedstorage.php

    Software

    Many software packages have already been compiled to run on the cluster. If you click on Software in the left toolbar or go to http://hpc.ncsu.edu/Software/Software.php, you'll see a list of software. In many cases there are "HowTos" that explain how to get access and submit example jobs. Suggestions for documentation updates and for additional software are encouraged.

    Compiling

    There are three compiler flavors available on the cluster: 1) the standard GNU compilers supplied with Linux, 2) the Intel compilers, and 3) the Portland Group compilers.

    The default GNU compilers are okay for compiling utility programs but in most cases are not appropriate for computationally intensive applications.

    Overall, the best performance has been observed with the Intel compilers. However, the Intel compilers support very few extensions to the Fortran standard, so codes written using non-standard Fortran may fail to compile without modification.

    The Portland Group compilers tend to be somewhat less syntactically strict than the Intel compilers while still generating more efficient code than the GNU compilers.

    For some pointers on using common build tools to port codes, see Makefile, Configure, Cmake. Additional information about the use of the Intel, PGI, and GNU compilers is available from the following links. Generally, objects and libraries built with different compiler flavors should not be mixed, as unexpected behavior may result.
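
    As an illustration only (the compiler driver names below are the vendors' standard ones; see the compiler links above for the environment setup and flags recommended on henry2), a serial Fortran code might be compiled with any one of:

    ifort -O2 -o myprog myprog.f90     # Intel
    pgf90 -O2 -o myprog myprog.f90     # Portland Group
    gfortran -O2 -o myprog myprog.f90  # GNU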

    If your program has memory requirements of more than ~1 GB, review the following information:
    A note on compiling executables with large (> ~1 GB) memory requirements
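
    The note linked above has the details. As a hedged illustration only, on x86-64 Linux a code with very large static arrays commonly needs a medium memory model at compile and link time; the exact flag combinations below are an assumption to verify against that note:

    ifort -mcmodel=medium -shared-intel -o bigprog bigprog.f90   # Intel
    gfortran -mcmodel=medium -o bigprog bigprog.f90              # GNU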

    Running Jobs

    The cluster is designed to run computationally intensive jobs on compute nodes. Running resource intensive jobs on the login nodes, while technically possible, is not permitted.

    Please limit your use of the login nodes to editing, compiling, and transferring files. Running more than one concurrent file transfer program (scp, sftp, cp) from the login nodes is also discouraged.

    Running Interactive Jobs

    There are two ways to run interactive jobs on the Henry2 cluster: interactive jobs on the HPC-VCL log-in node, and interactive jobs via the LSF scheduler.

    • Interactive jobs via the HPC-VCL log-in node

      This log-in node is just like the standard log-in nodes, but you do not share it with other users - it is your own. This means that you can run interactive jobs that are compute or memory intensive, such as visualizing graphics-intensive plots or movies, with the full power of the node and without affecting other users (or other users affecting you).

      To gain access to the HPC-VCL log-in node, send a request to oit_hpc@help.ncsu.edu with "access to the HPC-VCL log-in node" in the subject line.

      When access is granted, go to vcl.ncsu.edu > Reservations > New Reservation > HPC... > Create Reservation > Connect


      Clicking the Connect button will yield two options for logging in:
      (a) X11 terminal window: using the given IP address, ssh in with an SSH client such as PuTTY
      (b) Remote Desktop program: near the bottom of the GUI, click "Get RDP file", then click on the download. Once logged into the node, you will see a Linux desktop environment. A command-line terminal can be opened via
      Applications > Terminal

      Logging in via Remote Desktop is a great way to visualize movies or intensive graphics, because all of the graphics processing is done on the HPC-VCL node itself - as opposed to graphics data being tunneled over X11 to your laptop.

      It is also a great way to use GUI-intensive programs such as Matlab, VMD, Fluent, and SAS. Try:

      source /usr/local/apps/MATLAB/matlab2017b.csh
      matlab 
      
      source /usr/local/apps/vmd/vmd-1.9.1.csh
      vmd
    • Interactive jobs via the LSF scheduler

      You may use bsub interactively to log in to the compute nodes that LSF/bsub has reserved for you. You may then execute your bsub script line by line if you like. For instance, to log on to the NVIDIA P100 GPU node:

      bsub -I -q gpu -R "select[ngpus>0] rusage[ngpus_shared=2]" -m n3h39 tcsh
      Job <443595> is submitted to queue <gpu>.
      Waiting for dispatch ...
      Starting on n3h39
      
      Then running
      nvidia-smi
      
      gives details about the GPU card.
    Running Serial Jobs

    To run computationally intensive jobs on the cluster, use the compute nodes. Access to the compute nodes is managed by LSF. All tasks for the compute nodes should be submitted through LSF.

    The following steps are used to submit jobs to LSF:

    • Create a script file containing the commands to be executed for your job:
      #!/bin/csh
      
      #BSUB -o standard_output
      #BSUB -e standard_error
      
      cp input /share/myuserid/input
      cd /share/myuserid
      ./job.exe < input
      cp output /home/myuserid
      
      
    • Use the bsub command to submit the script to the batch system. In the following example two hours of run time are requested:
      bsub -W 2:00 < script.csh
      

    • The bjobs command can be used to monitor the progress of a job
    • The -e and -o options specify the files for standard error and standard output respectively. If these are not specified the standard output and standard error will be sent by email to the account submitting the job.
    • The bpeek command can be used to view standard output and standard error for a running job.
    • The bkill command can be used to remove a job from LSF (regardless of current job status).
    For running many instances of a job with differing data files or other systematic changes in the job submission file, you may want to automate the submission of jobs by writing scripts. A few examples are in the Perl HowTo.
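
    As a minimal sketch (not taken from the Perl HowTo; file and executable names are placeholders), a csh loop run on a login node can submit one job per input file:

    #!/bin/csh
    # submit a separate two-hour job for every file named input.*
    foreach f (input.*)
      bsub -W 2:00 -o out.%J -e err.%J "./job.exe < $f"
    end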

    Running MPI Parallel Jobs

    Here is a sample bsub job submission file for running a code compiled with the Intel compilers and an Open MPI library. The following job submission file, "bfoo", runs "ringping" in parallel on 16 cores.

     
    #!/bin/csh
    
    #BSUB -W 15
    #BSUB -n 16
    #BSUB -R span[ptile=4]
    #BSUB -q single_chassis
    #BSUB -o out.%J
    #BSUB -e err.%J
    
    source /usr/local/apps/openmpi/intel2013_ompi.csh
    mpirun ./ringping
    
    The job should be submitted from login01, login02, login03, or login04 (as of April 2017) with the command
    bsub < bfoo
    
    The job asks for 16 cores (-n 16) and 15 minutes (-W 15), and runs with 4 cores per blade (-R span[ptile=4]). It generates a standard error file err.xxxxx and a standard output file out.xxxxx. It is submitted to the single chassis queue, in which up to 56 cores and 5760 minutes (4 days) may be requested.

    To set up the Intel compilers before compiling, use the same source command as in the job submission file above. For the PGI compilers, you would use

     
    source /usr/local/apps/openmpi/ompi184_pgi151.csh
    

    For the GNU compilers (discouraged, because GNU-compiled codes typically run more slowly, while the usual point of parallel computing is to get jobs to run more quickly), use
     
    source /usr/local/apps/openmpi/openmpi_gcc.csh
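
    After sourcing one of these environment scripts, MPI codes are normally built with the Open MPI wrapper compilers, for example (source file names are placeholders):

    mpicc  -O2 -o ringping ringping.c    # C
    mpif90 -O2 -o ringping ringping.f90  # Fortran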
    

    Running MPI Parallel Jobs with Infiniband

    The cluster nodes are connected by a Gigabit Ethernet network. A limited number of nodes are also connected by a lower-latency InfiniBand network.

    For current MPI libraries, the same executable should be able to use either the GigE or InfiniBand network.

    Running Shared Memory Parallel Jobs

    Henry2 nodes contain a mix of dual-, quad-, six-, eight-, ten-, and twelve-core processors, with two processors per node. Thus the total number of processor cores per node ranges from 4 to 24. All the processor cores on a node share access to all of the memory on that node. Individual nodes can be used to run programs written using a shared memory programming model, such as OpenMP.

    Below is a sample shared memory job script that uses 16 processor cores.

     
    #!/bin/csh
    
    #BSUB -o out.%J
    #BSUB -e err.%J
    #BSUB -n 16
    #BSUB -R span[hosts=1]
    #BSUB -W 15
    
    setenv OMP_NUM_THREADS 16
    ./exec
    

    The number of job slots requested, -n 16 in this example, needs to match the number of threads the parallel job will use (OMP_NUM_THREADS). The resource request must specify span[hosts=1] to ensure that LSF assigns all the requested job slots on the same node - so they will have access to the same physical memory.

    See the individual compiler pages for the flags needed to compile codes with OpenMP shared memory parallelism enabled. Short course lecture notes on OpenMP from the fall of 2009 give some instructions for converting a Fortran or C code to use OpenMP parallelism.
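
    For reference, the flags below are the compilers' own standard OpenMP options (confirm against the individual compiler pages for the versions installed on henry2; file names are placeholders):

    icc -qopenmp -o exec code.c    # Intel (older releases use -openmp)
    pgcc -mp -o exec code.c        # Portland Group
    gcc -fopenmp -o exec code.c    # GNU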

    Requesting Specific Amount of Memory

    If your job needs a large amount of memory (RAM), you can use the syntax -R "rusage[mem=6000]", where 6000 specifies the amount of memory in units of MB. Note that the memory request rusage[mem=6000] is a per-job-slot resource request. Thus, the total memory requested will be 6000 multiplied by the number of job slots specified with -n.

    Below is an example job script with -n 16 and rusage[mem=8000]. The total amount of memory requested by this example is therefore 8000 x 16 = 128,000 MB, or about 128 GB.

    #!/bin/csh 
    
    #BSUB -o out.%J
    #BSUB -e err.%J
    #BSUB -n 16
    #BSUB -R "rusage[mem=8000]  span[hosts=1]" 
    #BSUB -W 15
    #BSUB -q shared_memory 
    
    setenv OMP_NUM_THREADS 16
    ./exec
    

    As of September 2016, the maximum amount of RAM available to a shared_memory queue job was 512 GB: 9 nodes have 128 GB of RAM, and 3 nodes have 512 GB.

    Job Queues and LSF

    A number of LSF queues are configured on the henry2 cluster. Often the best queue will be selected without the user specifying a queue to the bsub command. In some cases, LSF may override user queue choices and assign jobs to a more appropriate queue.

    Jobs requesting 16 or fewer processors and 100 minutes or less of run time are assigned to the debug queue and run with minimal wait times. Once a user is satisfied that a job is running well, more time will typically be requested.
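
    For example, a short submission like the following (a single core for 30 minutes; the executable name is a placeholder) falls within those limits and would normally land in the debug queue:

    bsub -W 30 -o out.%J -e err.%J ./myprog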

    Queues available to all users support jobs running on up to 256 processors for two days or jobs running for up to 15 days on up to 16 processors. Jobs that need up to two hours and up to 32 processors are run in a queue that has access to nearly all cluster nodes [generally the queues open to all users only have access to nodes that were purchased with central funding]. Jobs that require 56 or fewer processors and up to 4 days are placed in the single chassis queue. Jobs in this queue are scheduled on nodes located within the same physical chassis - resulting in better message passing bandwidth and lower latency for messages.

    Partners, those who have purchased nodes to add to the henry2 cluster, may add the bsub option -q partnerqueuename to place their job in the partner queue. Partner queues are dedicated to the use of the partner and their project and have priority access to the quantity of processors the partner has added to the cluster.

    A note on LSF job scheduling provides some additional details regarding how LSF is configured on henry2 cluster.

    LSF writes some intermediate files in the user's home directory as jobs are starting and running. If the user's disk quota has been exceeded, the batch job will fail, often without any meaningful error messages or output. The quota command will display usage of the /home file system.
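
    For example, on a login node:

    quota -s    # -s reports usage and limits in human-readable units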