Legend

    Henry2 active use by node processor model

    processor    number     number     available   available  empty
    model        of nodes   of cores   nodes       cores      nodes
    X5130               1          2           1          2       1
    E5310              14        112          14        112      14
    E5335              14        112          14        112      14
    E5405              81        648          63        428      39
    E5504              42        336          33        263      32
    E5520             174       1392         168       1344     168
    E5540              34        272          28        170      16
    E5620              87        808          66        599      64
    L5535              34        272          34        272      34
    E5620              87        808          66        599      64
    E5645             119       1428         104       1021      65
    L5640               1         12           1         12       1
    X5650              55        660          40        480      42
    E52640              4         64           2         20       1
    E52640v2           47        752          23        284      16
    E52650             51        816          33        428      21
    E52650L             8        128           5         66       4
    E52650v2            6         96           6         96       6
    E52690              4         64           2         32       2
    HE8374              4         64           2         32       2
    Silver41            1         16           1         16       1
    E52650v3           74       1480          35        635      30
    E52650v4           43       1032          32        709      20
    Gold6126            1         24           1         10       0
    Gold6130           64       2048          32        862      22

    Total nodes online: 1050
    Total cores online: 13446
    Total cores available: 8604
    Utilization: 36.01 %

    Data updated: 2019-10-17

HPC News

  • 3 October 2019 - cmake update and mpi module rename

      The new default module for cmake is version 3.15.4.

      The module mpi/gcc_openmpi has been removed. The following will give the same environment:
      module load openmpi-gcc/openmpi1.8.4-gcc4.8.2
  • 23 September 2019 - Change in requesting memory resources

      Previously, memory resource requests such as -R "rusage[mem=500]" applied per task. This type of memory resource request now applies per HOST: a job that previously requested mem=500 with n=4 and span[ptile=4] should now request mem=2000 to reserve the same amount of memory on the host.
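      Sketching the change as a job-script fragment (the core count, memory value, and executable name are illustrative, not recommendations):

```shell
#!/bin/bash
# Illustrative LSF job script fragment.
# Old behavior: -R "rusage[mem=500]" reserved 500 per task.
# New behavior: the rusage request applies per host, so a 4-task,
# single-host job reserves 4 x 500 = 2000 on that host.
#BSUB -n 4
#BSUB -R "span[ptile=4]"
#BSUB -R "rusage[mem=2000]"
./my_app    # placeholder executable
```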
  • 23 September 2019 - New bsub GPU syntax

      Following the recent LSF updates there is a new bsub syntax for using GPUs. A suggested option to request one GPU per host assigned to a job is -gpu "num=1:mode=exclusive_process:mps=yes", or in a script: #BSUB -gpu "num=1:mode=exclusive_process:mps=yes".
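      Put together in a job script (the core count and executable are placeholders):

```shell
#!/bin/bash
# Illustrative GPU job script using the new syntax; names are placeholders.
#BSUB -n 1
#BSUB -gpu "num=1:mode=exclusive_process:mps=yes"
./my_gpu_app
```

      Submit with bsub < script.sh as usual.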
  • 23 September 2019 - GPU usage Update

      Access to the GPUs has been interrupted due to implementing the new syntax of the updated LSF scheduler. We are working to resolve this problem as soon as possible.
  • 15 September 2019 - LSF Update

      The 8 September LSF update did not complete. The cluster is currently at 10.1.0.7; a patch will be applied on Sunday 15 September to bring the version to 10.1.0.8. Anticipate that LSF may be unavailable for much of the morning.
  • 08 September 2019 - LSF Update

      LSF will be updated to latest patch for version 10.1 - there may be brief periods where 'not responding' messages will display when using LSF commands while the updates are being applied. No impact is anticipated for running jobs. However, slight delays in starting new jobs may occur as the system is restarted.
  • 24 June 2019 - jhl* compute nodes network disruption

      Due to scheduled maintenance on network equipment at the James B. Hunt Library, these nodes will briefly lose their network connection between 6am and 7am. It's advisable not to schedule any jobs on them during that period.
  • 10 April 2019 - CLC Genomics Server Update

      Starting 10 April 2019 the CLC Genomics Server on the HPC cluster will be unavailable to allow it to be updated to the current version. The server will be available again starting 15 April 2019. Client software will need to be updated to the current version to be compatible with the server after the update.
  • 22 March 2019 - compute nodes jhl025-jhl028 unavailable

      Due to required maintenance on these nodes at the James B. Hunt Library, they'll be taken off-line. They will be re-opened in LSF once the maintenance is complete.
  • 3 February 2019 7pm - /gpfs_share partition is back on-line

  • 3 February 2019 10am - /gpfs_partners partition is back on-line

  • 1 February 2019 7pm - /gpfs_share and /gpfs_partners partitions are off-line

      The servers providing these partitions experienced significant slowness. To investigate the cause, the partitions were temporarily taken off-line. Updates will be posted when they become available.
  • 22 January 2019 - NFS exports unavailable & VCL-HPC reservations disabled

      1. Due to the upgrade of the OS and GPFS software on the server providing various NFS exports, they will not be available for the duration of this maintenance.
      2. The VCL-HPC production image will be switched to the new version - "HPC (CentOS 7.5 64 bit VM)".
      The expected duration of this maintenance is 2 hours (8am-10am).
  • 02-03 January 2019 - jhl* compute nodes unavailable

      Due to the scheduled outage at the James B. Hunt Library, jhl* compute nodes will be unavailable during this period. In preparation for this maintenance these nodes were closed in LSF for future jobs on 12/30/2018. They will be re-opened once the maintenance is complete.
  • 01 December 2018 - Gurobi License

      A floating Gurobi license is now available on Henry2, and Gurobi 8.1.0 has been installed. Use the command module load gurobi to set up the environment for Gurobi 8.1.0.
  • 30 November - 3 December 2018 - /rsstu is not available on the HPC cluster

      There were errors on a few disks in the storage unit providing the /rsstu partition. As a result it was necessary to temporarily un-mount it on the HPC cluster pending investigation by the vendor on Monday (12/03/2018). Updates will be posted when they become available.
  • 19 October 2018 - Henry2 OS upgrade

      Over the next couple of months there will be a gradual OS upgrade to version 7.5 on compute and login nodes. This process should have minimal impact on the general availability of cluster resources. If there are any issues, please report them by creating an NCSU Service Desk incident for OIT_HPC.
  • 22 September 2018 - Henry2 core Ethernet switch maintenance

      ComTech will be upgrading firmware on the core Ethernet switches used by the Henry2 cluster during the extended datacenter maintenance scheduled for September 22 and 23. Each switch will experience a few-minute outage while the new firmware is installed. These switches are used for storage access and job management.
  • 30 July 2018 - Monthly Maintenance on login nodes

      The login nodes will be taken off-line to apply OS security updates at the beginning of each month.
      Exact dates and times for each login node will be posted in the MOTD (Message Of The Day), displayed on the screen upon logging in to the relevant login node.
  • 9 July 2018 - Maintenance on login01,login02,login03[.hpc.ncsu.edu]

      These machines will be taken off-line to apply OS security updates at these times:
      10am - login01.hpc.ncsu.edu
      12pm - login02.hpc.ncsu.edu
      2pm - login03.hpc.ncsu.edu
      The expected duration of each maintenance is 2 hours.
      There should be no impact to the usual workflow after the update, but if there is an issue, please report it by creating an NCSU Service Desk incident for OIT_HPC.
      UPDATE: The maintenance on login01 was completed at 11:00am.
      UPDATE: The maintenance on login02 was completed at 12:40pm.
      UPDATE: The maintenance on login03 was completed at 2:45pm.
  • 2 July 2018 - Maintenance on login04.hpc.ncsu.edu

      login04.hpc.ncsu.edu will be taken off-line starting 10am to apply OS security updates. The expected duration of this maintenance is 2 hours. There should be no impact to the usual workflow after the update, but if there is an issue, please report it by creating an NCSU Service Desk incident for OIT_HPC.
      UPDATE: The maintenance was completed at 11am.
  • 27 June 2018 - Web Application Broken

      The database for the research computing web application was migrated to a new platform this morning. The application was not functioning correctly from about 3am until about 9:15am.
  • 23 June 2018 - New Top Level HPC Web Pages

      As an interim step toward a full redesign of the HPC web site, several new upper-level pages have been made active. The old site, in its entirety, is still available by selecting the 'Legacy Web Site' button on the new main page.
  • 1 June 2018 - VMD 1.9.3

      VMD version 1.9.3 has been installed on the henry2 cluster. A VCL HPC login node reservation should be used for running VMD [see the news item below related to remote desktop connection with HPC-VCL]. Use the following command to set up the environment for VMD 1.9.3:
      module load vmd/1.9.3
  • 1 June 2018 - Amber 18

      Amber 18 is now available on the henry2 cluster. It was built using the Intel compiler, so the Intel programming environment needs to be loaded in addition to the environment for Amber 18. Use the following commands to set up the environment for Amber 18 (from either the command line or a batch script):
      module load PrgEnv-intel/2017.1.132
      module load amber/18
  • 1 June 2018 - PGI 18.4

      PGI version 18.4 has been installed. Use the following command to set up the PGI 18.4 programming environment:
      module load PrgEnv-pgi

      Older versions can be selected by specifying the version explicitly, e.g.
      module load PrgEnv-pgi/18.1
  • 9 May 2018 - Remote Desktop connection with HPC-VCL

      It is now possible to have a Linux Desktop environment on the HPC with the HPC-VCL login node.
  • 9 April 2018 - /share, /gpfs_common, /gpfs_backup

      Starting about 8am these file systems will be unavailable to allow a physical repair to be done to their storage array.
  • 22 March 2018 - New Abaqus Version

      Abaqus version 2018 has been installed.

      module load abaqus

      will set up the environment to use Abaqus 2018. Run the command abaqus to invoke Abaqus.

      Previous versions can be accessed using

      module load abaqus/2016

      or

      module load abaqus/6.13-2

      and the target of the abaqus command will be adjusted accordingly.
  • 18 March 2018 - New Portland Group Compiler Version

      PGI version 18.1 has been installed.

      module load PrgEnv-pgi

      will set up the environment to use the new version.

      [edsills@login01 ~]$ module load PrgEnv-pgi
      [edsills@login01 ~]$ module list
      Currently Loaded Modulefiles:
       1) pgi/18.1                          3) openmpi/2.1.2/2018
       2) netcdf/4.5.0/openmpi-2.1.2/2018   4) PrgEnv-pgi/18.1
      [edsills@login01 ~]$ which pgf90
      /usr/local/pgi/linux86-64/18.1/bin/pgf90
      [edsills@login01 ~]$ which mpif77
      /usr/local/pgi/linux86-64/2018/mpi/openmpi-2.1.2/bin/mpif77

      Older versions can be selected by specifying the version explicitly, e.g.

      module load PrgEnv-pgi/16.7

      We strongly recommend not using any version older than 15.1.
  • 03-04 March 2018 - Network Switch Reboot

      During the extended data center maintenance scheduled for 03-04 March 2018, the ComTech switches that provide the core network for the henry2 cluster will be rebooted to update their software to the latest version for compliance with security standards. The reboots will interrupt network connections between henry2 nodes and between nodes and storage. Running jobs attempting to communicate with other nodes or to access storage during these events will most likely fail.

      Please plan job submissions to avoid having critical jobs running this weekend.
  • 04 January 2018 - LSF Update

      On Saturday 04 Jan 2018 the version of LSF on the henry2 cluster will be updated from 8.3 to 10.1.

      There should be no impact to running jobs. New job submissions and new job starts will be temporarily disabled while the upgrade is in progress. Expect the interruption to new jobs to be less than 4 hours.
  • 01 December 2017 - New LSF Resource Definitions

      The following new LSF resources have been defined:
      • sse
      • sse2
      • ssse3
      • sse4_1
      • sse4_2
      • avx
      • avx2
      These correspond to the various vector instruction sets supported on Intel Xeon processors. These can be used in the bsub resource string to specify that a job be scheduled on node(s) whose processors support the specified instruction set.
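      For example, a batch script could require nodes supporting AVX2 with a directive like this (a sketch using the standard LSF select syntax):

```
#BSUB -R "select[avx2]"
```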
  • 26 October 2017 - Gaussian 16.a03 Installed

      Gaussian 16 and the associated GaussView have been installed on the henry2 cluster. Use the add g16 command to configure the shell environment to use these new versions.
  • 22 August 2017 - Shared file system upgrade/change completed

      We completed the shared file system upgrade/change on 8/21/2017. The old /share, /share2, /share3 have been moved to /gpfs_common/old_share, /gpfs_common/old_share2, /gpfs_common/old_share3, respectively, and the old data there will be preserved for 20 days. After that, the old data will be wiped out.

      NOTE: You no longer have read permission in /gpfs_common and thus cannot run ls there. To access your data on the old shared file system, you need to type the full path to your own subdirectory, such as /gpfs_common/old_share/your-user-name, when you cd on login nodes or when you provide a folder name in WinSCP.

      The new shared file system you can access is

      /share/your-group-name

      where your-group-name is the first group listed when you type the command "groups". You can cd into /share/your-group-name and use the command

      mkdir your-user-name

      to create your own subdirectory, and store data and run jobs there, where your-user-name is your HPC username.

      Each group has a 10TB quota for its group directory /share/your-group-name. As before, /share is not backed up and files that have not been recently accessed are automatically deleted (currently the purge is set to remove files that have not been accessed for 30 days).
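      The steps above can be sketched as follows; your-group-name and your-user-name are resolved from your own account, and the snippet only prints the path (run the mkdir yourself on a login node):

```shell
# Sketch: derive the new scratch location described above.
# "your-group-name" is the first group reported by `groups`;
# "your-user-name" comes from $USER. This only prints the path.
group=$(groups | cut -d' ' -f1)
echo "/share/$group/$USER"
```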

  • 21 August 2017 - Henry2 Cluster Unavailable

      The cluster will be unavailable midnight-noon. A failing Ethernet switch will be replaced. The switch replacement will interrupt connections to storage for login nodes and many compute nodes.

      Since many jobs would be impacted anyway, the outage will also be used to move /gpfs_common and /gpfs_backup to new hardware. Therefore the originally announced one-hour maintenance is being extended to 12 hours, but this will avoid the need for another interruption to running jobs in the near future.

      A new organization of /share, /share1, and /share2 will be implemented on the new storage that will provide additional scratch quota to all HPC projects.

      Job scripts will need to be changed to reflect the new directory structure.

  • 11 August 2017 - Henry2 /home quotas

      Quotas on /home had not been enforced since the July 5th move of /home to new storage hardware. As of this morning, quotas are again being enforced on /home. Also, the quota command is working normally again.
  • 5 July 2017 - Henry2 Cluster Unavailable

      The cluster will be unavailable starting approximately 8am July 5 to allow the /home and /usr/local file systems to be moved to new storage hardware. The cluster is expected to be available again late in the day on July 5th.

      Jobs running at 8am will be lost. Queues will be disabled over the July 4th holiday to minimize the number of lost jobs.
  • 27 June 2017 - henry2 /home and /usr/local

      The /home and /usr/local file systems on the henry2 cluster were offline due to a partial power outage in the data center overnight (about 9pm). The file systems were recovered and back online around 9am. Work continues on restarting compute nodes that did not come back online cleanly following the power outage.
  • 23 April 2017 - Network maintenance

      ComTech will be performing maintenance on the network switches which form the core of the HPC cluster network, starting at 1pm and expected to take approximately 3 hours. During this time various parts of the cluster will be impacted as each of the 6 switches is updated.

      Queues will be paused Friday April 21 around 5pm to reduce the number of jobs running Sunday afternoon. Jobs running during the Sunday maintenance that try to access storage will likely fail.
  • 6 April 2017 - henry2 logins

      Login attempts to the henry2 cluster are currently failing. A Fibre Channel switch failed, resulting in loss of connection to /home and /usr/local. The switch has been replaced and the file systems are available again. Login authentication has returned to normal.
Copyright © 2019 · Office of Information Technology · NC State University · Raleigh, NC 27695