Cloud Scheduler

A cloud-enabled distributed resource manager backend

Cloud Scheduler manages virtual machines on clouds like Nimbus, OpenNebula, Eucalyptus, or EC2 to create an environment for batch job execution.

Users submit their jobs to a batch job queue like Condor, Sun Grid Engine, or Platform LSF, and Cloud Scheduler boots VMs to suit those jobs.

How Cloud Scheduler Fits into the Cloud / HPC Ecosystem

[Figure: Cloud Scheduler]

How a User Interacts with Cloud Scheduler

For the most part, a user doesn't interact with Cloud Scheduler at all. Everything should be automatic. Here's how it should work for the user:

  1. Jane prepares a VM image loaded with the software she needs for processing, then uploads it to an image repository. One of her colleagues may already have done this, or she can pick a pre-built image instead.
  2. Jane submits a bunch of processing jobs to a Condor pool. In the Condor jobs, she specifies regular Condor parameters, but also specifies a VM image that she would like her job to run on.
  3. Jane then waits for her jobs to complete.
  4. Jane gets her results.
So to the user, the only difference from a traditional batch-queue system should be that she creates an image, and specifies it in her job description.
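
As a rough illustration of step 2, the sketch below builds a Condor submit description in Python. Condor lets a submit file carry custom job attributes prefixed with `+`; the attribute name `VMImage`, the image URL, the file names, and the helper `build_submit_description` are all assumptions made for this example, not names confirmed by Cloud Scheduler.

```python
def build_submit_description(executable, arguments, vm_image):
    """Return the text of a Condor submit description.

    The '+VMImage' attribute is a hypothetical name used for illustration;
    check the Cloud Scheduler documentation for the attributes it reads.
    """
    lines = [
        # Regular Condor parameters.
        "universe = vanilla",
        f"executable = {executable}",
        f"arguments = {arguments}",
        "output = job.out",
        "error = job.err",
        "log = job.log",
        # Hypothetical custom attribute naming the VM image the job wants.
        f'+VMImage = "{vm_image}"',
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder executable, argument, and image URL.
    print(build_submit_description(
        "process.sh",
        "run42.dat",
        "http://repo.example.org/images/jane-analysis.img",
    ))
```

The resulting text could be saved to a file and handed to `condor_submit` like any other job description; only the extra image attribute differs from a traditional submission.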

So What Does Cloud Scheduler Do Again?

Cloud Scheduler acts after step 2 above. It looks at the job queue to discover which VM images are needed to complete the queued jobs, then boots VMs from those images on the clusters it has access to. These VMs run the jobs from the queue, and Cloud Scheduler shuts them down when they're no longer necessary.
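
The behaviour described above can be sketched as a single scheduling pass. This is a simplified illustration, not Cloud Scheduler's actual code: the `Job` and `VM` types, the `plan` function, the `max_vms` limit, and the one-VM-per-job rule are all assumptions made to keep the example small.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    vm_image: str  # image the job asked for in its description


@dataclass
class VM:
    vm_id: int
    vm_image: str  # image this VM was booted from


def plan(jobs, running_vms, max_vms):
    """Return (images_to_boot, vms_to_shut_down) for one scheduling pass.

    Assumes one VM per queued job and a single pool limited to max_vms.
    """
    wanted = Counter(job.vm_image for job in jobs)
    have = Counter()
    keep, shut_down = [], []
    for vm in running_vms:
        # Keep a VM only while queued jobs still need its image.
        if have[vm.vm_image] < wanted[vm.vm_image]:
            have[vm.vm_image] += 1
            keep.append(vm)
        else:
            shut_down.append(vm)

    free_slots = max(0, max_vms - len(keep))
    to_boot = []
    for image in sorted(wanted):
        # Boot one VM per job that has no matching VM yet, up to the limit.
        for _ in range(wanted[image] - have[image]):
            if len(to_boot) >= free_slots:
                return to_boot, shut_down
            to_boot.append(image)
    return to_boot, shut_down


if __name__ == "__main__":
    queue = [Job(1, "a.img"), Job(2, "a.img"), Job(3, "b.img")]
    vms = [VM(10, "a.img"), VM(11, "c.img")]
    print(plan(queue, vms, max_vms=3))
```

A real scheduler would repeat a pass like this periodically, call the cloud's API to boot and destroy the VMs, and account for per-cloud resources such as memory and CPU architecture.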

We aim to support Eucalyptus, OpenNebula, Nimbus, and Amazon EC2 on the backend.

Who Makes This?

Cloud Scheduler is developed by the University of Victoria High Energy Physics Grid Computing group (HEP-GC), together with the CANFAR project and NRC-Sussex in Ottawa. It will be used in CANFAR and in the HEP Legacy Data Project, both of which are NEP projects funded by CANARIE.

if you're interested in knowing more.

Where are We Now?

Right now, we're moving from the proof of concept phase to something that someone might actually use.

Source

We keep the source on GitHub. Feel free to take a look at it. It's dual-licensed under GPLv3 and Apache v2.