Understanding GridWay
GridWay is a meta-scheduler that can manage jobs submitted to grid computing nodes via the Globus Toolkit interfaces.
One aspect of job submission and management is accepting a job and running it under the correct account context, both to ensure application isolation and local security and to account for resource utilization.
GridWay achieves this using Sudo, the standard Unix privilege assignment tool. Sudo is a powerful system that can itself be complex to configure. These notes are an effort at better understanding the needs and getting the configuration correct.
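To make the idea of "running under the correct account context" concrete, here is a minimal Python sketch of how a service might build a sudo command line for a target account. It is not taken from GridWay; the function name build_sudo_command, the account name "alice", and the example command are hypothetical. Only the sudo options -n (never prompt for a password), -u (target user), and -- (end of sudo options) are standard sudo behavior.

```python
import shlex


def build_sudo_command(local_user, command):
    """Return the argv list that runs `command` as `local_user` via sudo.

    Hypothetical helper for illustration; GridWay's real invocation may differ.
    """
    # Reject empty names and names that sudo could mistake for an option.
    if not local_user or local_user.startswith("-"):
        raise ValueError("invalid local account name: %r" % (local_user,))
    # -n: fail instead of prompting for a password (suits a daemon).
    # -u: the account the command should run as.
    # --: stop sudo option parsing so the command's own flags pass through.
    return ["sudo", "-n", "-u", local_user, "--"] + list(command)


if __name__ == "__main__":
    argv = build_sudo_command("alice", ["/bin/hostname", "-f"])
    # Print a shell-quoted form for inspection; nothing is executed here.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

For a command like this to succeed without a password, the sudoers policy on the host must grant the calling service account the right to run that specific command as the target user, which is the configuration these notes aim to get right.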
Official Documentation Overview
Installation & Administration
The GridWay Installation and Administration Guide is a good place to get started with the GridWay experience, even for users. While most of the document is dedicated to installation and configuration issues that aren't critical to get started using UABgrid stage, Chapter 1 and the first two sections of Chapter 2 offer a good overview of the architecture of GridWay and the problem space it addresses.
An important taxonomy that GridWay documents is the concept of department, enterprise, and partner grids. These map onto the three grid domains we interface with:
- Department Grid - local HPC systems like Cheaha, Coosa, and Olympus
- Enterprise Grid - essentially UABgrid
- Partner Grids - SURAgrid, TeraGrid, etc
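The mapping in the list above can be written down as a small lookup table, which is handy when a script needs to know which tier a resource belongs to. This is only a sketch: the names GRID_TIERS and tier_of are hypothetical, and the system names are the ones listed in these notes.

```python
# Grid tiers from the GridWay taxonomy, mapped to the systems named above.
GRID_TIERS = {
    "department": ["Cheaha", "Coosa", "Olympus"],
    "enterprise": ["UABgrid"],
    "partner": ["SURAgrid", "TeraGrid"],
}


def tier_of(system):
    """Return the tier a system belongs to, or None if it is not listed."""
    wanted = system.lower()
    for tier, systems in GRID_TIERS.items():
        # Compare case-insensitively so "cheaha" and "Cheaha" both match.
        if wanted in (name.lower() for name in systems):
            return tier
    return None
```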
Additionally, it is important to familiarize yourself with the GridWay Architecture. The following image is excerpted from that document given its importance: http://www.gridway.org/documentation/stable/images/gw_arch.jpg
The administration guide essentially has two main components outside the install documentation:
- Scheduler Configuration - describes how to control the scheduling policies for the GridWay install, including how to develop custom schedulers.
- Middleware Access Driver (MAD) Configuration - describes how to control the interfaces to and from external systems. These are essentially the interfaces that feed in information about the available resources and that are used to submit jobs to remote resources and transfer data to and from those systems.
User Documentation
The User Documentation is clearly geared toward users. One of its most important elements is the GridWay job life cycle, which clarifies the states a job can take.
