Magrathea
Magrathea provides an interface between batch scheduling systems and virtual machine monitors. It allows jobs to be submitted in a controlled manner into virtual machines running on cluster nodes, where several virtual machines may run on a single cluster node. Magrathea can also be seen as a scheduler running on each cluster node, which schedules virtual machines according to the jobs submitted into them by batch scheduling systems. The current implementation supports the PBS Pro scheduling system, which has to be slightly modified to support all features of Magrathea, and the Xen and VServer virtual machine monitors.
Further information about Magrathea may also be found in CESNET technical report 25/2007.
The name comes from The Hitchhiker's Guide to the Galaxy by Douglas Adams. Magrathea is a planet where other planets, such as Earth, are built for sufficiently wealthy creatures, such as mice.
Architecture
Magrathea consists of three main components: a master process representing each physical machine, a slave process running in each virtual machine, and an optional cache process that stores status information about all virtual machines running on a cluster. The architecture of Magrathea and its interaction with the resource management system and the virtual machine monitor are depicted in the following figure:
Master
Runs on each physical machine, manages all virtual worker nodes running on that machine, and stores their current status in the status cache. The status cache is updated whenever a status changes, and also periodically, after a predefined time period, even when no change has occurred.
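The update rule described above (push on every change, and refresh after a fixed period even without a change) can be sketched as follows. This is only an illustrative model, not Magrathea's actual code: the class name StatusReporter, the push callback, the refresh_interval parameter, and the status strings used in the example are all assumptions made for this sketch.

```python
import time
from typing import Callable, Dict, Tuple


class StatusReporter:
    """Hypothetical helper deciding when a master pushes a status to the cache."""

    def __init__(self,
                 push: Callable[[str, str], None],
                 refresh_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._push = push                          # assumed callback that writes to the cache
        self._refresh_interval = refresh_interval  # assumed refresh period in seconds
        self._clock = clock
        self._last: Dict[str, Tuple[str, float]] = {}  # vm -> (status, time of last push)

    def report(self, vm: str, status: str) -> bool:
        """Push the status if it changed or the last push is too old.

        Returns True when an update was actually sent.
        """
        now = self._clock()
        previous = self._last.get(vm)
        changed = previous is None or previous[0] != status
        stale = previous is not None and now - previous[1] >= self._refresh_interval
        if changed or stale:
            self._push(vm, status)
            self._last[vm] = (status, now)
            return True
        return False


if __name__ == "__main__":
    sent = []
    reporter = StatusReporter(lambda vm, s: sent.append((vm, s)), refresh_interval=60.0)
    reporter.report("vm1", "free")     # first report is always pushed
    reporter.report("vm1", "free")     # unchanged and recent, so it is skipped
    reporter.report("vm1", "running")  # changed, so it is pushed
    print(sent)
```

The periodic refresh lets a cache distinguish a node whose status is simply unchanged from one whose master has stopped reporting, which is presumably why updates are sent even when nothing has changed.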
Slave
Runs on each virtual worker node. It asks its master daemon for permission to start a new job, informs the master about finished jobs, and behaves according to commands received from the master.
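The request and notification exchange between a slave and its master can be modelled in a few lines. The sketch below is a toy in-process model under stated assumptions: the class and method names are hypothetical, and the admission policy (only one virtual machine per physical node may run jobs at a time) is invented purely for illustration; the real policy and wire protocol are not described here.

```python
from typing import Dict, Optional, Set


class Master:
    """Toy master with an assumed policy: one active virtual machine per node."""

    def __init__(self) -> None:
        self.active_vm: Optional[str] = None
        self.running: Dict[str, Set[str]] = {}  # vm -> ids of jobs it is running

    def request_start(self, vm: str, job_id: str) -> bool:
        """Grant permission only if no other virtual machine is currently active."""
        if self.active_vm not in (None, vm):
            return False
        self.active_vm = vm
        self.running.setdefault(vm, set()).add(job_id)
        return True

    def job_finished(self, vm: str, job_id: str) -> None:
        """Record a finished job and release the node once the VM is idle."""
        jobs = self.running.get(vm, set())
        jobs.discard(job_id)
        if not jobs and self.active_vm == vm:
            self.active_vm = None


class Slave:
    """Toy slave that asks the master before starting a job and reports completion."""

    def __init__(self, vm: str, master: Master) -> None:
        self.vm = vm
        self.master = master

    def start_job(self, job_id: str) -> bool:
        return self.master.request_start(self.vm, job_id)

    def finish_job(self, job_id: str) -> None:
        self.master.job_finished(self.vm, job_id)


if __name__ == "__main__":
    master = Master()
    vm_a, vm_b = Slave("vm-a", master), Slave("vm-b", master)
    print(vm_a.start_job("job-1"))  # granted: the node is idle
    print(vm_b.start_job("job-2"))  # denied under the assumed policy
    vm_a.finish_job("job-1")
    print(vm_b.start_job("job-2"))  # granted after vm-a became idle
```

In a real deployment the slave and master would communicate across the virtual machine boundary rather than through direct method calls; the direct calls here only keep the sketch self-contained.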
Cache
Stores status data about all virtual worker nodes so that a batch scheduling system does not have to contact each node to check its current status before deciding where to execute a job. The cache is not required when the scheduling system is able to get status information directly from the computational nodes; however, the status cache is the more scalable solution.
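The scalability argument can be illustrated with a minimal sketch: masters write statuses into one central store, and the scheduler answers "which nodes are usable?" with a single lookup instead of one query per node. The class StatusCache, its methods, and the status strings are hypothetical names chosen for this example, not part of Magrathea's interface.

```python
from typing import Dict, List


class StatusCache:
    """Hypothetical central store of the last known status of each virtual node."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}

    def update(self, vm: str, status: str) -> None:
        """Called by masters whenever they report a status."""
        self._status[vm] = status

    def nodes_with(self, status: str) -> List[str]:
        """Called by the scheduler: one lookup, no per-node communication."""
        return sorted(vm for vm, s in self._status.items() if s == status)


if __name__ == "__main__":
    cache = StatusCache()
    cache.update("vm1", "free")
    cache.update("vm2", "running")
    cache.update("vm3", "free")
    print(cache.nodes_with("free"))
```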
Source Code
The source code of Magrathea, together with the administrator's documentation, is available as the "magrathea" module in the METACenter CVS repository.