Demystifying Casualties of Evictions in Big Data Priority Scheduling

Type of publication Peer-reviewed
Publication form Original article (peer-reviewed)
Author Rosà Andrea, Chen Lydia Y., Birke Robert, Binder Walter
Project LoadOpt - Workload Characterization and Optimization for Multicore Systems

Journal SIGMETRICS Perform. Eval. Rev.
Volume (Issue) 42(4)
Page(s) 12 - 21
Title of proceedings SIGMETRICS Perform. Eval. Rev.
DOI 10.1145/2788402.2788406


The ever-increasing size and complexity of large-scale datacenters make it harder to develop efficient scheduling policies for big data systems, where priority scheduling is often employed to guarantee the allocation of system resources to high-priority tasks, at the cost of task preemption and the resulting resource waste. A large number of related studies focus on understanding workloads and their performance impact on such systems; nevertheless, existing work pays little attention to evicted tasks, their characteristics, and the resulting impairment of system performance. In this paper, we base our analysis on Google cluster traces, where tasks can experience three different types of unsuccessful events, namely eviction, kill, and fail. We focus in particular on eviction events, i.e., preemption of task execution due to higher-priority tasks, and rigorously quantify their performance drawbacks in terms of wasted machine time and resources, with particular attention to priority. Motivated by the high dependency of eviction on the underlying scheduling policies, we also study its statistical patterns and its dependency on other types of unsuccessful events. Moreover, by considering co-executed tasks and system load, we deepen the understanding of priority scheduling, showing how priority and machine utilization affect the eviction process and related tasks.