Ph.D Thesis

Ph.D StudentYaniv Jonathan
SubjectJob Scheduling Mechanisms for Cloud Computing
DepartmentDepartment of Computer Science
Supervisor PROF. Joseph Naor
Full Thesis textFull thesis text - English Version


The rise of cloud computing as a leading platform for large-scale computation has introduced a wide range of challenges to the field of scheduling. In the cloud paradigm, users receive on-demand access to shared pools of computing resources, to which they submit applications. The primary goal of a cloud scheduler is to provide and satisfy service level agreements (SLA) for running applications. For example, businesses run production-level jobs that must meet strict deadlines on their completion time. Since the underlying physical resources are limited, the scheduler must decide which service requests to accept in view of the required SLAs; and furthermore, incorporate clever resource allocation algorithms to increase the utility from completed jobs. In general, designing schedulers for the cloud is challenging. First, jobs differ significantly in their value, urgency, structure and resource requirements. Second, job requests are not always known in advance, and therefore the schedulers must make irreversible decisions on-the-fly. Finally, users may attempt to manipulate the scheduler into processing their own jobs in favor of others.

In this dissertation, we develop deadline-aware job scheduling mechanisms for cloud environments. Our main theoretical study begins with the fundamental problem of online deadline scheduling. In its most general form, the problem admits super-constant lower bounds on the competitive ratio of any scheduler. Nevertheless, we develop constant competitive algorithms under a natural deadline slackness assumption, which requires that no job deadline is tight. Moreover, our worst-case bounds improve as the input jobs exhibit large deadline slackness. We then extend the basic allocation model to handle practical aspects of cloud scheduling, such as scheduling jobs with complex inner dependencies, scheduling jobs with uncertain requirements, and providing upfront commitments on SLAs.

Another aspect of our work studies truthful scheduling mechanisms, where users are incentivized not to manipulate the system by falsely reporting job parameters. Such mechanisms are essential not only for public clouds, but also private clouds shared by entities that belong to the same organization. Based on our allocation algorithms, we develop truthful mechanisms and provide performance guarantees similar to their non-truthful counterparts.

Finally, inspired by our theoretical study, we implement a deadline-aware job scheduling algorithm for Hadoop YARN. The algorithm was evaluated against existing benchmark algorithms through extensive experiments, and show substantial improvements over state-of-the-art. The algorithm will be incorporated in the upcoming Hadoop 2.8 distribution.