This page documents my exploration of Lustre and collects notes on things learned along the way. It may look like a dumping ground at first, but this material needs to be documented somewhere, and this seemed like a good place to start. I also intend to collect any questions that we are particularly interested in for the Lustre training session.
A good place to start with Lustre is our introduction, which gives a brief overview and links to other Lustre-related pages. The LustreQuickStart page shows a simple Lustre installation with Lustre servers and a patchless Lustre client.
Pre-reqs
- Lustre support matrix: lists supported OS / kernels.
MGS
- Orchestrates between clients and servers
- The Lustre manual mentions having only one MGS per site; however, see this thread for exceptions and more information.
- The MGS and MDS are combined in many installations, and this seems like a good idea.
MDS
- Unique per file system
- Needs storage of about 1-2% of the file system capacity
- Requires more CPU power
- RAID 0+1 recommended
- Failover configuration
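
As a rough illustration of the 1-2% sizing rule in the list above, here is a minimal Python sketch. The function name `mdt_capacity_range` and the 500 TB figure are made up for the example; the percentages are taken from the note above and should be treated as a rule of thumb, not a guarantee.

```python
from typing import Tuple


def mdt_capacity_range(fs_capacity_tb: float) -> Tuple[float, float]:
    """Return the (low, high) metadata storage estimate in TB.

    Applies the 1-2% rule of thumb from the notes above.
    """
    if fs_capacity_tb < 0:
        raise ValueError("capacity must be non-negative")
    # 1% is capacity / 100, 2% is capacity / 50.
    return (fs_capacity_tb / 100, fs_capacity_tb / 50)


if __name__ == "__main__":
    # Hypothetical 500 TB file system.
    low, high = mdt_capacity_range(500.0)
    print(f"Metadata storage estimate: {low:.1f} to {high:.1f} TB")
```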
OSS
Mounting file system
- File system name: limited to 8 characters
- Device name: limited to 8 characters
- Clients can mount the file system at a mount point with a longer name
- Mounting the file system starts the service
- Use -o nosvc to mount without starting the service
- -o and -d (loop device) can't be used together?
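
The mounting notes above can be sketched as a small Python helper that only builds and prints the command lines; it does not run them. The device path, mount points, MGS NID, and file system name are hypothetical placeholders, and the 8-character check reflects the limit noted above rather than anything verified against a particular Lustre release.

```python
from typing import List

MAX_FSNAME_LEN = 8  # file system name limit, as noted above


def check_fsname(fsname: str) -> str:
    """Reject empty names and names longer than the 8-character limit."""
    if not fsname or len(fsname) > MAX_FSNAME_LEN:
        raise ValueError(
            f"file system name must be 1 to {MAX_FSNAME_LEN} characters: {fsname!r}"
        )
    return fsname


def server_mount_cmd(device: str, mount_point: str,
                     start_service: bool = True) -> List[str]:
    """Build the mount command for a server target.

    With start_service=False, '-o nosvc' is added so the mount
    does not start the service.
    """
    cmd = ["mount", "-t", "lustre"]
    if not start_service:
        cmd += ["-o", "nosvc"]
    cmd += [device, mount_point]
    return cmd


def client_mount_cmd(mgs_nid: str, fsname: str, mount_point: str) -> List[str]:
    """Build the client mount command; the mount point may have a longer name."""
    check_fsname(fsname)
    return ["mount", "-t", "lustre", f"{mgs_nid}:/{fsname}", mount_point]


if __name__ == "__main__":
    # All names below are placeholders for illustration only.
    print(" ".join(server_mount_cmd("/dev/sdb", "/mnt/mdt")))
    print(" ".join(server_mount_cmd("/dev/sdb", "/mnt/mdt", start_service=False)))
    print(" ".join(client_mount_cmd("mgsnode@tcp0", "testfs", "/mnt/lustre_testfs")))
```

Building the command as a list keeps the sketch safe to experiment with: nothing is mounted until the list is explicitly passed to something like `subprocess.run` by someone with root access on a test node.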
Unmounting
- umount stops service
- -d: used when unmounting loop devices
- -f: unmount without recovery enabled
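
The unmount options listed above can be sketched the same way. This is a hedged illustration that assumes the d and f options are the standard `umount -d` and `umount -f` flags; the helper name `umount_cmd` and the mount point are hypothetical, and the command is only printed, not executed.

```python
from typing import List


def umount_cmd(mount_point: str, detach_loop: bool = False,
               force: bool = False) -> List[str]:
    """Build a umount command line for a Lustre target or client.

    detach_loop adds -d (for loop devices); force adds -f
    (unmount without recovery, per the notes above).
    """
    cmd = ["umount"]
    if detach_loop:
        cmd.append("-d")
    if force:
        cmd.append("-f")
    cmd.append(mount_point)
    return cmd


if __name__ == "__main__":
    # Placeholder mount point for illustration only.
    print(" ".join(umount_cmd("/mnt/ost0")))
    print(" ".join(umount_cmd("/mnt/ost0", detach_loop=True)))
    print(" ".join(umount_cmd("/mnt/ost0", force=True)))
```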
Failover configuration
- Lustre failover can be used to support failure of a node/server, but not the storage/target.
Data redundancy
- Lustre itself provides no redundancy support.
- See related threads: OST redundancy between nodes and redundancy with object storage
