Project

Multilevel Scheduling in Large Scale High Performance Computers

Applicant: Ciorba Florina
Number: 169123
Funding scheme: Project funding
Research institution: Departement Mathematik und Informatik, Universität Basel
Institution of higher education: University of Basel - BS
Main discipline: Information Technology
Start/End: 01.09.2017 - 30.04.2021
Approved amount: 374'016.00

Keywords (9)

parallel and distributed computing; large scale computing; multilevel parallelism; concurrency and data locality; dynamic load balancing; scheduling and load balancing; single-level scheduling; high performance computing; multilevel scheduling

Lay Summary (translated from German)

Lead
High performance computers are parallel systems with shared and distributed memory. The number of computing units in such systems has increased over the years and will continue to rise. This leads to systems with massive amounts of hardware parallelism. Hardware parallelism is complemented by software parallelism. A good match between the degrees and scales of these two types of parallelism at the various levels of the high performance computing ecosystem is key to exploiting the computational power delivered by these machines.
Lay summary

Content and research objectives

Hardware parallelism ranges from machine instructions to global compute sites. Similarly, software parallelism ranges from scalar instructions to global job queues. Exploiting the available hardware parallelism even at a single level is notoriously difficult. This is partly due to the difficulty of exposing, expressing, and exploiting parallelism in computational applications.

The project will answer the question: Given massive parallelism at multiple levels and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained?

This project concentrates on scheduling and load balancing.

In this project we propose a multilevel scheduling (MLS) approach for achieving scalable scheduling in large scale high performance computing systems across the various levels of parallelism, with a focus on software parallelism.

The MLS approach aims to use all available parallelism and to address hardware heterogeneity in large scale high performance computers. The methodology for achieving the multilevel scheduling research aims comprises theoretical research studies, computer simulations, and experiments on high performance computers.

Scientific and social context of the research project

This project takes the most efficient existing scheduling solutions, extends them beyond one or two levels, and scales them within single levels of parallelism.

The project aims to make a fundamental advance toward simpler, large scale use of high performance computing systems, with impacts not only in the computer science community but also in all computational science domains.

Last update: 18.10.2016

Lay Summary (English)

Lead
High performance computers are parallel systems with shared and distributed memory. The number of computing units in such systems has increased over the years and will continue to increase. This results in computing systems with massive amounts of hardware parallelism. Hardware parallelism is complemented by software parallelism. A good match between the degrees and scales of these two types of parallelism at the various levels of the high performance computing ecosystem is key to exploiting the computational power delivered by these machines.
Lay summary

Content and research objectives 

Hardware parallelism ranges from machine instructions to global compute sites. Similarly, software parallelism ranges from scalar instructions to global job queues. Exploiting the available hardware parallelism even at a single level is notoriously challenging. This is partly due to the difficulty of exposing and expressing parallelism in applications.

The project will answer the question: Given massive parallelism, at multiple levels and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained?

This project concentrates on scheduling and load balancing.

In this project we propose a multilevel scheduling (MLS) approach for achieving scalable scheduling in large scale high performance computing systems across the multiple levels of parallelism, with a focus on software parallelism.

The MLS approach will leverage all available parallelism and address hardware heterogeneity in large scale high performance computers such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained. The methodology for achieving the aims of multilevel scheduling involves theoretical research studies, simulation, and experiments.

Scientific and social context of the research project

This project builds on the most efficient existing scheduling solutions, extending them beyond one or two levels and scaling them out within single levels of parallelism.

The project aims to make a fundamental advance toward simpler-to-use large scale high performance computing systems, with impacts not only in the computer science community but also in all computational science domains.

Last update: 18.10.2016

Publications

Eleliemy Ahmed, Ciorba Florina M. (2021), A Distributed Chunk Calculation Approach for Self-scheduling of Parallel Applications on Distributed-memory Systems, in Journal of Computational Science (JOCS2021), 5, 101284.
Mohammed Ali, Ciorba Florina M. (2020), SimAS: A simulation‐assisted approach for the scheduling algorithm selection under perturbations, in Concurrency and Computation: Practice and Experience, 32(15), e5648.
Mohammed Ali, Cavelan Aurélien, Ciorba Florina M., Cabezón Rubén M., Banicescu Ioana (2020), Two-level Dynamic Load Balancing for High Performance Scientific Applications, in Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Kasielke F., Tschüter R., Velten M., Ciorba Florina M., Iwainsky C. (2019), Exploring Loop Scheduling Enhancements in OpenMP: An LLVM Case Study, in the 18th International Symposium on Parallel and Distributed Computing, IEEE, Amsterdam, Netherlands.
Eleliemy Ahmed, Ciorba Florina M. (2019), Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach, in 20th International Workshop on Parallel and Distributed Scientific and Engineering Computing, IEEE, Rio de Janeiro, Brazil.
Eleliemy Ahmed, Ciorba Florina M. (2019), Dynamic Loop Scheduling Using MPI Passive-Target Remote Memory Access, in The Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), IEEE, Pavia, Italy.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M., Kasielke Franziska, Banicescu Ioana (2019), An Approach for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems, in Future Generation Computer Systems (FGCS2020), 17.
Cavelan Aurélien, Cabezón Rubén M., Müller Korndörfer Jonas H., Ciorba Florina M. (2019), Finding Neighbors in a Forest: Ab-tree for Smoothed Particle Hydrodynamics Simulations, in The 2019 Spheric International Workshop, UK.
Mohammed Ali, Cavelan Aurélien, Ciorba Florina M. (2019), rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Independent Tasks, in International Conference on High Performance Computing & Simulation (HPCS), IEEE, Dublin, Ireland.
Ciorba Florina M., Kale Vivek, Iwainsky Christian, Klemm Michael, Müller Korndörfer Jonas H. (2019), Toward A Standard Interface for User-Defined Scheduling in OpenMP, in International Workshop on OpenMP (IWOMP), Springer, Auckland, New Zealand.
Mohammed Ali, Ciorba Florina M. (2018), SiL: An Approach for Adjusting Applications to Heterogeneous Systems Under Perturbations, in Euro-Par 2018: Parallel Processing Workshops, Euro-Par 2018 International Workshops, Springer International Publishing, Cham.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M. (2018), Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments, in The 2018 International Conference on High Performance Computing & Simulation (HPCS 2018), IEEE Computer Society, Orléans, France.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M., Kasielke Franziska, Banicescu Ioana (2018), Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications, in The International Symposium on Parallel and Distributed Computing (ISPDC2018), IEEE Computer Society, Geneva, Switzerland.
Ciorba Florina M., Iwainsky Christian, Buder Patrick (2018), OpenMP Loop Scheduling Revisited: Making a Case for More Schedules, in International Workshop on OpenMP (IWOMP2018), Springer, Barcelona, Spain.
Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M. (2017), Efficient Generation of Parallel Spin-images Using Dynamic Loop Scheduling, in The 8th International Workshop on Multicore and Multithreaded Architectures and Algorithms HPCC2017, IEEE Computer Society, Bangkok, Thailand.
Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M. (2017), Exploring the Relation Between Two Levels of Scheduling Using a Novel Simulation Approach, in The IEEE 16th International Symposium on Parallel and Distributed Computing (ISPDC), IEEE Computer Society, Innsbruck, Austria.
Boulmier Anthony, Banicescu Ioana, Ciorba Florina M., Abdennadher Nabil (2017), An Autonomic Approach for the Selection of Robust Dynamic Loop Scheduling Techniques, in International Symposium on Parallel and Distributed Computing (ISPDC2017), IEEE, Innsbruck, Austria.
Hoffeins Franziska, Ciorba Florina M., Banicescu Ioana (2017), Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications, in International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, Innsbruck, Austria.
Eleliemy Ahmed, Ciorba Florina M., A Resourceful Coordination Approach for Multilevel Scheduling, in International Conference on High Performance Computing & Simulation (HPCS2020), IEEE, Spain.
Müller Korndörfer Jonas Henrique, Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M., LB4OMP: A Dynamic Load Balancing Library for Multithreaded Applications, in Transactions on Parallel and Distributed Systems (TPDS2021).
Müller Korndörfer Jonas Henrique, Bielert Mario, Pilla Laércio L., Ciorba Florina M., Mapping Matters: Application Process Mapping on 3-D Processor Topologies, in The International Conference on High Performance Computing & Simulation (HPCS2020), IEEE, Barcelona, Spain.

Scientific events

Active participation

Title | Type of contribution | Title of article or contribution | Date | Place | Persons involved
European OpenMP Users Conference | Talk given at a conference | LB4OMP: A Load Balancing Library for OpenMP | 02.12.2020 | Virtual conference, Great Britain and Northern Ireland | Ciorba Florina; Eleliemy Ahmed; Müller Korndörfer Jonas Henrique
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC20) | Individual talk | LB4OMP: A Load Balancing Portfolio for OpenMP | 17.11.2020 | Virtual conference, United States of America | Ciorba Florina; Eleliemy Ahmed; Müller Korndörfer Jonas Henrique
ISC high performance 2020 | Poster | MLS: Multilevel Scheduling in Large Scale High Performance Computers | 22.06.2020 | Frankfurt (Digital), Germany | Eleliemy Ahmed; Ciorba Florina; Müller Korndörfer Jonas Henrique
ISC high performance 2020 | Poster | MLS: Multilevel Scheduling in Large Scale High Performance Computers | 16.06.2020 | Online conference, Germany | Eleliemy Ahmed; Ciorba Florina; Müller Korndörfer Jonas Henrique
The SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20) | Individual talk | Robust Dynamic Load Balancing of Scientific Applications Against Perturbations | 12.02.2020 | Seattle, WA, United States of America | Ciorba Florina
The SIAM Conference on Parallel Processing for Scientific Computing (SIAM PP20) | Talk given at a conference | Two-level Dynamic Load Balancing for High Performance Scientific Applications | 12.02.2020 | Seattle, WA, United States of America | Ciorba Florina
The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC19) | Poster | A Runtime Approach for Dynamic Load Balancing of OpenMP Parallel Loops in LLVM | 22.11.2019 | Denver, United States of America | Ciorba Florina
ISC high performance 2019 | Poster | Multilevel scheduling in large-scale high performance computers | 18.06.2019 | Frankfurt, Germany | Ciorba Florina; Eleliemy Ahmed; Müller Korndörfer Jonas Henrique
ISC high performance 2019, the ISC Ph.D. Forum | Poster | Multilevel Scheduling of Computations in Large-scale Parallel computers | 17.06.2019 | Frankfurt, Germany | Ciorba Florina; Eleliemy Ahmed
The Platform for Advanced Scientific Computing (PASC19) Conference | Poster | Identifying Performance Challenges in Smoothed Particle Hydrodynamics Simulations | 13.06.2019 | Zurich, Switzerland | Ciorba Florina
The Platform for Advancing Scientific Computing (PASC 2019) Conference | Poster | Identifying Performance Challenges in Smoothed Particle Hydrodynamics Simulations | 12.06.2019 | Zurich, Switzerland | Ciorba Florina
Leogang 2019 High Performance Computing Workshop | Talk given at a conference | A look inside well known HPC benchmarks | 24.02.2019 | Leogang, Austria | Ciorba Florina
Leogang 2019 High Performance Computing Workshop | Poster | Dynamic Loop Scheduling Using MPI Passive-Target Remote Memory Access | 24.02.2019 | Leogang, Austria | Eleliemy Ahmed; Ciorba Florina
The International Workshop on OpenMP (iWomp 2018) | Talk given at a conference | OpenMP Loop Scheduling Revisited: Making a Case for More Schedules | 21.12.2018 | Barcelona, Spain | Ciorba Florina
The International Workshop on OpenMP (iWomp 2018) | Talk given at a conference | OpenMP Loop Scheduling Revisited: Making a Case for More Schedules | 21.09.2018 | Barcelona, Spain | Ciorba Florina
The International Conference on High Performance Computing & Simulation (HPCS 2018) | Talk given at a conference | Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments | 16.07.2018 | Orléans, France | Ciorba Florina; Eleliemy Ahmed
The International Conference on High Performance Computing & Simulation (HPCS 2018) | Talk given at a conference | Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments | 16.07.2018 | Orléans, France | Ciorba Florina
The Platform for Advanced Scientific Computing (PASC18) Conference | Poster | A Study of the Performance of Scientific Applications with Dynamic Loop Scheduling under Perturbations | 02.07.2018 | Basel, Switzerland | Ciorba Florina
The Platform for Advanced Scientific Computing (PASC18) Conference | Poster | Dynamic Loop Scheduling Using the MPI Passive-Target Remote Memory Access Model | 02.07.2018 | Basel, Switzerland | Eleliemy Ahmed; Ciorba Florina
The International Symposium on Parallel and Distributed Computing (ISPDC 2018) | Talk given at a conference | Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications | 27.06.2018 | Geneva, Switzerland | Ciorba Florina
The International Symposium on Parallel and Distributed Computing (ISPDC 2018) | Talk given at a conference | Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications | 25.06.2018 | Geneva, Switzerland | Ciorba Florina; Eleliemy Ahmed
ISC high performance 2019, the ISC Ph.D. Forum | Poster | Design of Robust Scheduling Methodologies in High Performance Computing | 17.06.2018 | Frankfurt, Germany | Ciorba Florina
The International Conference for High Performance Computing and Communications (HPCC 2017) | Poster | Towards the Reproduction of Selected Dynamic Loop Scheduling Experiments Using SimGrid-SimDag | 18.12.2017 | Bangkok, Thailand | Ciorba Florina; Eleliemy Ahmed
The 8th International Workshop on Multicore and Multithreaded Architectures and Algorithms (M2A2 2017) in conjunction with the 19th IEEE International Conference for High Performance Computing and Communications (HPCC 2017) | Talk given at a conference | Efficient Generation of Parallel Spin-images Using Dynamic Loop Scheduling | 18.12.2017 | Bangkok, Thailand | Eleliemy Ahmed; Ciorba Florina
The International Conference for High Performance Computing, Networking, Storage and Analysis (SC17) | Poster | A Methodology for Bridging the Native and Simulated Executions of Parallel Applications | 12.11.2017 | Denver, Colorado, United States of America | Ciorba Florina


Communication with the public

Communication Title Media Place Year
New media (web, blogs, podcasts, news feeds etc.) MultiLevel scheduling Project website International 2017

Awards

Title | Year
ACM Senior Member Award | 2020
Best Paper Award, International Conference on Cluster Computing (Cluster 2019) for the paper entitled "Algorithm-Based Fault Tolerance for Parallel Stencil Computations", co-authored with A. Cavelan | 2019

Abstract

High performance computing systems are increasing in size (in terms of node and core count) and diversity (e.g., core types per node), leading to an increase in their available parallelism. Hardware parallelism can be found at several levels, from machine instructions to global compute sites. This results in several corresponding levels of software parallelism, from scalar instructions to global job queues. Unfortunately, exploiting the available hardware parallelism even at a single level is notoriously challenging, in part due to difficulty in exposing and expressing parallelism in the computational applications. Exposing, expressing, and exploiting parallelism is even more difficult when considering the increase in parallelism within each level and when considering more than a single or a couple of parallelism levels. Scheduling and load balancing are vital parts of any successful effort of coordinating and managing parallelism in high performance computing.

This project proposes to investigate and develop multilevel scheduling (MLS), a multilevel approach for achieving scalable scheduling in large scale high performance computing systems across the multiple levels of parallelism, with a focus on software parallelism. By integrating multiple levels of parallelism, MLS differs from hierarchical scheduling, traditionally employed to achieve scalability within a single level of parallelism. MLS is based on extending and bridging the most successful (batch, application, and thread) scheduling models beyond single or a couple of parallelism levels (scaling across) and beyond their current scale (scaling out).

The proposed MLS approach aims to leverage all available parallelism and address hardware heterogeneity in large scale high performance computers such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained. The methodology for reaching the multilevel scheduling aims involves theoretical research studies, simulation, and experiments.

The expected outcome is an answer to the following research question: Given massive parallelism, at multiple levels, and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets (e.g., robustness against perturbations) are achieved, and acceptable efficiency (e.g., tradeoff between maximizing parallelism and minimizing cost) is maintained? This proposal leverages the most efficient existing scheduling solutions to extend them beyond one or two levels, respectively, and to scale them out within single levels of parallelism. The proposal addresses four tightly coupled problems: scalable scheduling, adaptive and dynamic scheduling, heterogeneous scheduling, and bridging schedulers designed for competitive execution (e.g., batch and operating system schedulers) with those for cooperative execution (e.g., application level schedulers).

Overall, the project aims to make a fundamental advance toward simpler to use large scale high performance computing systems, with impacts not only in the computer science community but also in all computational science domains.
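
As an illustration of what application-level (loop) scheduling involves, the sketch below computes chunk sizes with guided self-scheduling, a well-known dynamic loop self-scheduling technique in which each work request receives ceil(remaining / workers) iterations. It is not the project's MLS method; the function name `guided_chunks` and its parameters are hypothetical and chosen only for this example.

```python
import math


def guided_chunks(total_iterations, workers):
    """Yield successive chunk sizes under guided self-scheduling.

    Each request is assigned ceil(remaining / workers) iterations, so
    chunks start large (low scheduling overhead) and shrink toward the
    end of the loop (better load balance). Illustrative sketch only.
    """
    remaining = total_iterations
    while remaining > 0:
        chunk = math.ceil(remaining / workers)
        yield chunk
        remaining -= chunk


# Example: split 100 loop iterations among 4 workers.
sizes = list(guided_chunks(100, 4))
print(sizes)
```

The chunk sizes never increase from one request to the next and always sum to the total iteration count, which is the property a self-scheduling runtime relies on when handing out work.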