Installation

Requirements: OS and Platform

  • Operating system
    • Red Hat Enterprise Linux 4 and 5
    • SuSE Linux Enterprise Server 9 and 10
  • Kernel
    • Linux 2.6 series, newer than 2.6.15
  • Platform
    • x86, IA-64, x86-64 (EM64T and AMD64)
    • PowerPC architectures (for clients only) and mixed-endian clusters
  • Interconnect
    • TCP/IP
    • Quadrics Elan 3 and 4
    • Myri-10G and Myrinet-2000
    • Mellanox
    • InfiniBand (Voltaire, OpenIB, Silverstorm and any OFED-supported InfiniBand adapter)
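
The kernel requirement above can be checked before anything is installed. The following is a minimal sketch rather than part of the original procedure: the function name kernel_newer_than is hypothetical, and it assumes the release string begins with a dotted numeric version, as the output of 'uname -r' does.

```shell
# Hypothetical helper: succeed if kernel release $1 is strictly newer than $2.
# Only the dotted numeric prefix is compared; a suffix such as "-92.el5" is ignored.
kernel_newer_than() {
    have=${1%%-*}
    want=${2%%-*}
    echo "$have $want" | awk '{
        n = split($1, a, ".")
        m = split($2, b, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            x = a[i] + 0
            y = b[i] + 0
            if (x > y) exit 0
            if (x < y) exit 1
        }
        exit 1  # equal versions do not count as newer
    }'
}

# Compare the running kernel against the minimum stated in the requirements.
if kernel_newer_than "$(uname -r)" 2.6.15; then
    echo "kernel $(uname -r) is newer than 2.6.15"
else
    echo "kernel $(uname -r) is too old"
fi
```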

Requirements: Build tools and other utilities

  • kernel-devel (patchless client)
  • kernel-headers
  • perl
  • gcc
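
A quick way to confirm that the command-line tools in this list are present is sketched below. The function name missing_tools is hypothetical. It only covers commands found on PATH (perl, gcc); kernel-devel and kernel-headers are packages, so they would be checked with 'rpm -q kernel-devel kernel-headers' instead.

```shell
# Hypothetical helper: print each named command that is not on PATH.
# Returns non-zero if at least one command is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# Check the build tools named in the requirements list.
missing_tools perl gcc || echo "install the tools listed above before continuing"
```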

Server Installation

  • Install CentOS 5.2 using KickstartInstallation.
    • Add perl and wget to the '%packages' list.
  • Some packages may pull in the latest version of kernel, kernel-headers, or kernel-devel as dependencies.
  • Download the Lustre 1.8 server RPMs from http://www.sun.com/software/products/lustre/get.jsp
    • kernel-lustre-smp-<ver>
    • lustre-modules-<ver>
    • lustre-ldiskfs-<ver>
    • lustre-<ver>
  • Install the above packages using
      rpm -ivh <package-name>
    
  • The e2fsprogs package must be upgraded rather than installed fresh:
      rpm -U e2fsprogs-<ver>
    
  • Reboot the system into the Lustre kernel (select it at the boot prompt; to make the change permanent, edit the menu.lst file).
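
Before rebooting, it can help to see which entry menu.lst will boot by default. The sketch below is an illustration, not part of the original steps. It assumes a GRUB legacy file with a numeric, zero-based 'default=' line followed by 'title' entries, located at /boot/grub/menu.lst; the function name default_boot_title is hypothetical.

```shell
# Hypothetical helper: print the title of the default boot entry in the
# GRUB legacy menu file given as $1. Fails if no matching entry exists.
default_boot_title() {
    awk '
        /^default[ =]/ { sub(/^default[ =]+/, ""); want = $0 + 0 }
        /^title[ \t]/ {
            if (n == want) {
                sub(/^title[ \t]+/, "")
                print
                found = 1
                exit
            }
            n++
        }
        END { if (!found) exit 1 }
    ' "$1"
}

# Assumed location of the menu file; adjust if your system differs.
if [ -r /boot/grub/menu.lst ]; then
    default_boot_title /boot/grub/menu.lst
fi
```

If the printed title is not the Lustre kernel entry, change the 'default=' index in menu.lst to the position of that entry, counting from 0.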

Server Configuration - Mounting MDS and OST nodes

  • Create a combined MGS/MDT file system on the block device. On the MDS node, run:
      mkfs.lustre --reformat --device-size=300000 --fsname lustre --mdt --mgs /tmp/lustre-mdt
    
  • Create a mount point
      mkdir -p /mnt/mds 
    
  • Mount (start) the combined MGS/MDT file system on the block device. On the MDS node, run:
      mount -t lustre -o loop /tmp/lustre-mdt /mnt/mds
    
  • Create an OST for an OSS. For each OST, run this command on the OSS node:
     mkfs.lustre --reformat --device-size=1000000 --fsname lustre --ost --mgsnode=<ip.of.mgs.node>@<interface> /tmp/lustre-ost1
    
  • Create a mount point
     mkdir -p /mnt/ost1
    
  • Mount the OST. On the OSS node, run:
     mount -t lustre -o loop /tmp/lustre-ost1 /mnt/ost1 
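
Because the three OST steps are repeated for every OST, they lend themselves to a loop. The sketch below only prints the commands so they can be reviewed before being run as root on the OSS node. The function name print_ost_commands is hypothetical, the numbered paths follow the /tmp/lustre-ost1 and /mnt/ost1 pattern used above, and the MGS NID 10.0.0.42@tcp is an example value.

```shell
# Hypothetical helper: print the format, mkdir, and mount commands for
# OSTs 1..$2, all registering with the MGS at NID $1. Nothing is executed.
print_ost_commands() {
    mgs_nid=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "mkfs.lustre --reformat --device-size=1000000 --fsname lustre --ost --mgsnode=$mgs_nid /tmp/lustre-ost$i"
        echo "mkdir -p /mnt/ost$i"
        echo "mount -t lustre -o loop /tmp/lustre-ost$i /mnt/ost$i"
        i=$((i + 1))
    done
}

# Example: commands for two OSTs (the NID is an assumed example value).
print_ost_commands 10.0.0.42@tcp 2
```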
    

Client Installation

  • Install CentOS 5.2 using KickstartInstallation.
    • Add gcc, perl, and wget to the '%packages' list.
  • Some packages may pull in the latest version of kernel, kernel-headers, or kernel-devel as dependencies.
  • We need a C/C++ compiler to build the source code. gcc depends on the kernel-headers package; if we install gcc without it, yum downloads the latest kernel-headers package by default. It might be a good idea to include kernel-headers in the kickstart file. For now, install the matching kernel-headers version from a mirror site:
     rpm -ivh  http://ftp.usf.edu/pub/centos/5.2/os/i386/CentOS/kernel-headers-2.6.18-92.el5.i386.rpm 
    
  • Install gcc, make
      yum install gcc make 
    
  • The source code needs to be compiled against the kernel source. Install the matching kernel-devel package, which provides the kernel build files, from the CentOS mirror site:
     rpm -ivh  http://ftp.usf.edu/pub/centos/5.2/os/i386/CentOS/kernel-devel-2.6.18-92.el5.i686.rpm 
    
  • Download lustre source code from http://www.sun.com/software/products/lustre/get.jsp
  • Extract source code into your home dir.
     tar -xvzf lustre-source.tar.gz 
    
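  • Configure the Lustre source against the kernel source. Running ./configure --help lists all configure options. The --with-linux path must point to the directory installed by kernel-devel (on CentOS 5, typically a directory under /usr/src/kernels/).
     cd lustre
     ./configure --with-linux=/usr/src/kernel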
    
  • If configure succeeds, run make and then make install:
     make
     make install 
    

Verify Installation

The lctl utility is used to perform configuration, maintenance, and debugging tasks on a Lustre file system. Run 'lctl help' for a list of available commands. The utility is available on both servers and clients.

  • List the node IDs (NIDs) / interfaces available
      [root@condor-node2 ~]# lctl list_nids
      10.0.0.24@tcp
    
  • Run the 'lctl device_list' command on the server side
      [root@localhost ~]# lctl device_list
      0 UP mgs MGS MGS 5
      1 UP mgc MGC10.0.0.42@tcp 0c186ac4-84a7-cf44-3d05-19e9db8cfb54 5
      2 UP mdt MDS MDS_uuid 3
      3 UP lov lustre-mdtlov lustre-mdtlov_UUID 4
      4 UP mds lustre-MDT0000 lustre-MDT0000_UUID 3
      5 UP osc lustre-OST0000-osc lustre-mdtlov_UUID 5
      6 UP ost OSS OSS_uuid 3
      7 UP obdfilter lustre-OST0000 lustre-OST0000_UUID 5
    
  • Ping server from client
     [root@condor-node2 ~]# lctl ping 10.0.0.42
     12345-0@lo
     12345-10.0.0.42@tcp
    
    

Patchless client installation problems

  • Make sure that the Lustre modules are installed in the right place
     $ cd /lib/modules/`uname -r`
     $ find . | grep lustre.ko
     ./kernel/net/lustre/ko2iblnd.ko
     ./kernel/fs/lustre/lustre.ko
    
  • If the modules are installed but 'lctl list_nids' reports an error such as
    $ lctl list_nids
    opening /dev/lnet failed: No such device
    hint: the kernel modules may not be loaded
    IOC_LIBCFS_GET_NI error 19: No such device
    
    then you may need to load the modules using modprobe
    modprobe -v lustre
    
  • If modprobe fails with
    FATAL: Module lustre not found.
    
    then rebuild the module dependency list by running
     depmod
    
  • If 'lctl list_nids' still reports errors, or lsmod does not show the lustre modules, reboot the system and try loading the modules again using modprobe
     modprobe -v lustre
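
Whether the module is actually loaded can be tested from lsmod output rather than judged by eye. This is a minimal sketch, assuming the usual lsmod layout with the module name in the first column; the function name module_listed is hypothetical.

```shell
# Hypothetical helper: read lsmod output on stdin and succeed only if the
# module named in $1 appears in the first column.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# On a client this would be used as:
#   lsmod | module_listed lustre || modprobe -v lustre
```

Matching on the first column avoids a false positive from modules that merely list lustre in their "Used by" column.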
    

Client Configuration

  • Create a mount point
     mkdir -p /mnt/lustre
    
  • Mount the file system on the client.
     mount -t lustre <ip.of.mds.node>@<interface>:/lustre /mnt/lustre
    
  • Verify that the file system is working by running the dd, df, and ls commands. For example:
     dd if=/dev/zero of=/mnt/lustre/foo bs=1024k count=100
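
The dd check can be turned into a small pass/fail test that also cleans up after itself. The sketch below writes a 1 MiB file instead of 100 MiB to keep it quick; the function name smoke_test and the test file name are hypothetical.

```shell
# Hypothetical helper: write a 1 MiB file of zeros into directory $1,
# confirm that exactly 1048576 bytes were stored, then remove the file.
smoke_test() {
    dir=$1
    file="$dir/smoke-test.$$"
    dd if=/dev/zero of="$file" bs=1024k count=1 2>/dev/null || return 1
    size=$(wc -c < "$file" | tr -d ' ')
    rm -f "$file"
    [ "$size" -eq 1048576 ]
}

# On a client with the file system mounted as described above:
#   smoke_test /mnt/lustre && df -h /mnt/lustre && ls -l /mnt/lustre
```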
    

Unmounting Server

  • Use the umount command to unmount a Lustre server. The -d switch also frees the loop device backing the mount. For example, to unmount an OST:
     umount -d /mnt/ost1
    
  • Lustre preserves the state of connected clients if the umount is done gracefully. The next time the server is started, it waits for clients to reconnect and then goes through the recovery procedure. If the -f (“force”) option is given to umount, the server evicts all clients and does not go through recovery. In that case, all previously connected clients need to reconnect to resolve I/O errors. [explore timeout issue].

Modifying file system after initial configuration

Use the tunefs.lustre command to modify the file system after initial configuration.

More Configuration Options

  • Specifying failout or failover mode. The mode determines what clients see when an OST becomes unreachable because of a failure, network issues, or umount:
    • failout mode: Lustre clients receive errors (EIOs) as soon as a timeout expires, instead of waiting for the OST to recover.
    • failover mode: Lustre clients wait for the OST to recover.

The default is failover mode. To select failout mode, pass the failover.mode parameter with the value 'failout' to mkfs.lustre when creating the OST. For example:

     mkfs.lustre --reformat --device-size=1000000 --fsname lustre --ost --mgsnode=<ip.of.mgs.node>@<interface> --param="failover.mode=failout" /tmp/lustre-ost1

Reference