Last modified on 01/14/14 14:14:11

DevOps Weekly Meeting | January 14, 2014

Time & Location: 9:30am-10:45am in LHL164


Attendees: jpr, mhanby, billb, tanthony

2014-01-09 Agenda

  • MATLAB 2014a availability
  • Options for moving HPC from ROCKS to in-house provisioning
  • Outline soft move of RCS compute fabric "cheaha" and related services from BEC to the 936 building


  • MATLAB 2014a is available but no longer supports CentOS5, which prevents it from being used on Cheaha, which is currently based on ROCKS5 and CentOS5.
  • Compared cluster provisioning options and reviewed updates on general approaches; timely given the platform issue with MATLAB 2014a.
  • In-depth discussion of a soft migration to the 936 building, the goal being, at minimum, to retain researcher access to storage while compute services are shut down. Additional FY14 compute hardware acquisitions may enable a simple degraded compute mode during the migration.


MATLAB 2014a

The first 2014 release of MATLAB, 2014a, is available. This release no longer supports CentOS5, which prevents it from being used on Cheaha, which is currently based on ROCKS5 and CentOS5. tanthony will review the release to see what features have been added, to get a sense for how much near-term pressure may exist to support the latest release on the cluster. This is in addition to the license manager migration since the 2013b release, which prevents 2013b from being supported on the cluster or by any lab that uses the floating license manager.

HPC Cluster Provisioning Options

There have been plans to upgrade from ROCKS5 to ROCKS6 to get the cluster to CentOS6. Because of the general disruption this causes, the need to recompile software, and compatibility with older hardware, it's easy to put off and wait for new hardware. We've been pulling apart our ROCKS5 install for some time so that not all services are provided by the head node (NAS for homedirs and storage, Lustre/DDN for scratch, etc.), but we have never completely pulled it all apart; SGE, Web/Ganglia, 411, and hardware provisioning still come from ROCKS and run on the head node.

The problem with ROCKS is that it tends to age quickly because CentOS and other enterprise derivatives of RedHat more conservatively curate their package selection. It may be good to move away from ROCKS completely, or rather, to make the HPC OS profile easier to customize. This fits well with our cloud-provisioned-HPC and compile-to-hardware use-cases that have motivated much of the RCS system design. Our goal is to make it easy for researchers to create their data analysis environment and then scale it across nodes.

At its core, ROCKS assumes a two-network topology, with the gateway node providing all cluster server services and using PXE boot on the internal LAN to auto-provision all other nodes. RCS provisioning with kickstart+Puppet has been our target for this migration away from ROCKS. Since we still rely on ROCKS for provisioning compute nodes, we need a node discovery and imaging service.
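As a rough sketch of the kickstart+Puppet approach, a compute node kickstart file would install a minimal OS and hand the node off to Puppet for ongoing configuration. All hostnames, URLs, and hashes below are illustrative placeholders, not our actual configuration:

```
# Minimal CentOS 6 kickstart sketch for a compute node (illustrative values)
install
url --url=http://provision.example.org/centos6/os/x86_64
network --bootproto=dhcp
rootpw --iscrypted $6$examplehash
authconfig --enableshadow --passalgo=sha512
clearpart --all --initlabel
autopart
reboot

%packages
@core
puppet
%end

%post
# Hand the node off to Puppet for all further configuration
puppet agent --server puppet.example.org --waitforcert 60 --onetime
%end
```

The point of this split is that kickstart only gets the node to a minimal, bootable state; everything cluster-specific lives in Puppet, where it can be customized per node class instead of baked into a ROCKS distribution.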

We're moving to Puppet3 in part to use tools like The Foreman. There are other provisioning tools: the OpenStackPlusCeph fabric uses Crowbar, and other choices exist. Related tools like Vagrant and Docker fit into this puzzle and are redefining modern systems development. CIS bases its clusters and systems on Debian, using Diskless Remote Boot in Linux (DRBL) for the PXE component and either Clonezilla for imaging or an automated Debian install using a pre-seed file (akin to kickstart on RedHat systems).
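For comparison, a Debian pre-seed file plays the same role as kickstart: it answers the installer's questions unattended. A minimal sketch (mirror, password hash, and package list are illustrative, not CIS's actual configuration) might look like:

```
# Minimal Debian preseed sketch (illustrative values)
d-i debian-installer/locale string en_US.UTF-8
d-i netcfg/choose_interface select auto
d-i mirror/http/hostname string ftp.us.debian.org
d-i mirror/http/directory string /debian
d-i partman-auto/method string regular
d-i partman/confirm boolean true
d-i passwd/root-password-crypted password $6$examplehash
d-i pkgsel/include string puppet openssh-server
d-i finish-install/reboot_in_progress note
```

As with the kickstart approach, the pre-seed only bootstraps a minimal system; a configuration management tool takes over from there.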

HPC Cluster Migration to 936 Building

The Shared HPC Compute facility in BEC is being decommissioned. This requires moving the Research Computing System hardware that provides the HPC compute fabric (a.k.a. Cheaha's head node and compute nodes), Lustre storage (MDS and OSS[1-3]), the RCS firewall fabric, and supporting VMs (e.g. Nagios) from BEC to new data center space in 936. We would like to accomplish this with as little operational disruption as possible. At worst, Cheaha would still be accessible (user access to file systems), just not running any HPC jobs. If possible, we will coordinate any new compute hardware purchases to pre-seed the 936 data center space and then simply enter a reduced-compute-capacity mode while the gen3 hardware is moved from BEC to 936.

In 2013Q4 we installed login-01 and login-02, two new nodes in RUST (Intel 16-core, 96GB RAM) to help us scale our login node capacity on the cluster and integrate with the OpenStack nova-compute fabric. We will leverage these nodes extensively during the 936 move. We will leap-frog through these nodes to move services out of BEC and over to 936.

The major items to address, and an outline of the migration:

  • Networking
    • Extend 10G research network fabric to 936
    • Includes the supporting VLANs for many of the RCS internal networks
    • May want to leverage a Force10 switch acquisition to expand 10G connectivity in RUST and then have a second switch for the Ceph fabric
  • Lustre
    • Migrate Lustre access from Infiniband to Ethernet in preparation for the move
    • Acquire new hardware for one MDS and three OSS nodes in RUST to replace the aging DDN front-end nodes in BEC
      • These nodes will serve Lustre access via the 10G network during the move
      • This will help us keep storage online while moving compute nodes between buildings
    • Move OSS nodes by introducing new OSS services in RUST and then decommissioning the OSS servers in BEC.
    • Move the MDS
      • This is the trickiest piece because it will require at least a short service interruption to move the file system image.
    • Extend Fibre Channel to 936 to support DDN access via Infiniband after the move.
  • Head node
    • Virtualize the head node on the KVM fabric of login-01 and login-02
  • RCS Firewalls and VMs
    • Most of our non-ROCKS services run as KVM-hosted virtual machines and are made available through a Linux firewall fabric using Firewall Builder, running on two nodes in BEC called rcs-srv-01 and rcs-srv-02.
    • Extend Firewall Builder fabric to include login-01 and login-02 in RUST; make login-01 primary
    • Move KVM VMs (e.g. Nagios, Puppet Master, Robinhood and others) to login-01 and login-02
    • Decommission rcs-srv-01 and rcs-srv-02
  • Ceph Access
    • Technically not part of 936 migration, but the new research storage Ceph block devices will be served off of login-01 and login-02 during the time the move is taking place.
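
The Lustre OSS steps above (introduce new OSTs in RUST, then drain and decommission the old ones in BEC) can be sketched with the standard Lustre utilities. File system name, device paths, OST index, and MGS node below are illustrative placeholders, not our actual layout:

```shell
# Format a new OST on a replacement OSS node in RUST
# (fsname, index, device, and MGS node are illustrative)
mkfs.lustre --fsname=scratch --ost --index=3 \
    --mgsnode=mds-rust@tcp0 /dev/sdb

# Mount it to bring the new OST into the file system
mount -t lustre /dev/sdb /mnt/ost3

# On the MGS, deactivate an old BEC OST so no new objects
# land on it while it is drained and decommissioned
lctl conf_param scratch-OST0000.osc.active=0
```

Note the `@tcp0` NID in the sketch: serving the new OSTs over Ethernet rather than Infiniband is what lets storage stay online while compute nodes move between buildings.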