lpt: a Tool for Tuning the Level of Parallelism of Spark Applications

Type of publication Peer-reviewed
Publication form Proceedings (peer-reviewed)
Author Rosales, Eduardo; Rosà, Andrea; Binder, Walter
Project Fundamentals of Parallel Programming for Platform-as-a-Service Clouds

Proceedings (peer-reviewed)

Title of proceedings 25th Asia-Pacific Software Engineering Conference (APSEC)


Spark is increasingly becoming the platform of choice for big-data analyses, mainly due to its fast, fault-tolerant, in-memory processing model. Despite the popularity and maturity of the Spark framework, tuning Spark applications to achieve high performance remains challenging. In this paper, we present lpt, a novel tool that assists users in improving the level of parallelism of applications running on top of Spark in local mode. lpt helps users tune the level of parallelism of Spark applications so that they spawn enough tasks to fully exploit the available computing resources. Our evaluation results show that optimizations guided by lpt can achieve speedups of up to 2.72x.