Service Level Agreement Based Scheduling

Many Grid Computing scenarios involve jobs with complex workflows that must be mapped onto computational resources; the user typically expects some assurance about when the job will run. Unfortunately, computational resources are typically controlled by batch queue systems, which offer no guarantees (or any other information) about when a job will run. The exception is when advance reservation is used. However, advance reservation adversely affects resource utilisation, and therefore the resource owner's income, and so is undesirable. Further, while advance reservation is sufficient for most scenarios, it is far from necessary.

We propose an alternative approach that explores the space between these two extreme levels of service, namely "run this job whenever it gets to the head of the queue" and "run this job at this precise time". The main idea is to provide different levels of service by forging agreements between the parties involved (user, resource owner, etc.). Such agreements are reached on the basis of constraints expressed by the user and/or the resource owner, and essentially specify an agreed level of service. The use of Service Level Agreements (SLAs) gives rise to a fundamentally new approach to job scheduling on the Grid.
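
As an illustration only, the range of service levels described above can be sketched by modelling an SLA as a time window on a job's execution. The class and field names below (ServiceLevelAgreement, earliest_start, latest_finish, permits, slack) are hypothetical and are not taken from the project; real agreements may carry other constraints, such as cost or penalties. The sketch assumes times are expressed in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceLevelAgreement:
    """Hypothetical SLA: a window within which the job must run.

    A value of None means that side of the window is unconstrained.
    """
    earliest_start: Optional[float] = None
    latest_finish: Optional[float] = None

    def permits(self, start: float, runtime: float) -> bool:
        """Return True if running the job at `start` honours the agreement."""
        if self.earliest_start is not None and start < self.earliest_start:
            return False
        if self.latest_finish is not None and start + runtime > self.latest_finish:
            return False
        return True

    def slack(self, runtime: float) -> Optional[float]:
        """Width of the range of feasible start times, or None if unbounded.

        This is the freedom left to the resource owner's scheduler.
        """
        if self.earliest_start is None or self.latest_finish is None:
            return None
        return self.latest_finish - self.earliest_start - runtime


# The two extremes, and one point in the space between them,
# for a job assumed to need 50 seconds.
best_effort = ServiceLevelAgreement()              # run whenever the queue allows
reservation = ServiceLevelAgreement(100.0, 150.0)  # run at this precise time
flexible = ServiceLevelAgreement(0.0, 500.0)       # run somewhere in this window
```

Under these assumptions, the best-effort agreement permits any start time, the reservation leaves the scheduler no slack at all, and the flexible agreement leaves 450 seconds of freedom in which the resource owner can place the job to keep utilisation high.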

This work was carried out in the context of an EPSRC-funded project, which started in January 2004 and lasted for four years.