parallel and distributed computing; large scale computing; multilevel parallelism; concurrency and data locality; dynamic load balancing; scheduling and load balancing; single-level scheduling; high performance computing; multilevel scheduling
Eleliemy Ahmed, Ciorba Florina M. (2021), A Distributed Chunk Calculation Approach for Self-scheduling of Parallel Applications on Distributed-memory Systems, in
Journal of Computational Science (JOCS2021), 5, 101284.
Mohammed Ali, Ciorba Florina M. (2020), SimAS: A simulation‐assisted approach for the scheduling algorithm selection under perturbations, in
Concurrency and Computation: Practice and Experience, 32(15), e5648.
Mohammed Ali, Cavelan Aurélien, Ciorba Florina M., Cabezón Rubén M., Banicescu Ioana (2020), Two-level Dynamic Load Balancing for High Performance Scientific Applications, in
Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing, Society for Industrial and Applied Mathematics, Philadelphia, PA.
Kasielke F., Tschüter R., Velten M., Ciorba Florina M., Iwainsky C. (2019), Exploring Loop Scheduling Enhancements in OpenMP: An LLVM Case Study, in
The 18th International Symposium on Parallel and Distributed Computing, IEEE, Amsterdam, Netherlands.
Eleliemy Ahmed, Ciorba Florina M. (2019), Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach, in
20th International Workshop on Parallel and Distributed Scientific and Engineering Computing, IEEE, Rio de Janeiro, Brazil.
Eleliemy Ahmed, Ciorba Florina M. (2019), Dynamic Loop Scheduling Using MPI Passive-Target Remote Memory Access, in
The Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), IEEE, Pavia, Italy.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M., Kasielke Franziska, Banicescu Ioana (2019), An Approach for Realistically Simulating the Performance of Scientific Applications on High Performance Computing Systems, in
Future Generation Computer Systems (FGCS2020), 17.
Cavelan Aurélien, Cabezón Rubén M., Müller Korndörfer Jonas H., Ciorba Florina M. (2019), Finding Neighbors in a Forest: A b-tree for Smoothed Particle Hydrodynamics Simulations, in
The 2019 SPHERIC International Workshop, UK.
Mohammed Ali, Cavelan Aurélien, Ciorba Florina M. (2019), rDLB: A Novel Approach for Robust Dynamic Load Balancing of Scientific Applications with Independent Tasks, in
International Conference on High Performance Computing & Simulation (HPCS), IEEE, Dublin, Ireland.
Ciorba Florina M., Kale Vivek, Iwainsky Christian, Klemm Michael, Müller Korndörfer Jonas H. (2019), Toward A Standard Interface for User-Defined Scheduling in OpenMP, in
International Workshop on OpenMP (IWOMP), Springer, Auckland, New Zealand.
Mohammed Ali, Ciorba Florina M. (2018), SiL: An Approach for Adjusting Applications to Heterogeneous Systems Under Perturbations, in
Euro-Par 2018: Parallel Processing Workshops (Euro-Par 2018 International Workshops), Springer International Publishing, Cham.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M. (2018), Performance Reproduction and Prediction of Selected Dynamic Loop Scheduling Experiments, in
The 2018 International Conference on High Performance Computing & Simulation (HPCS 2018), IEEE Computer Society, Orléans, France.
Mohammed Ali, Eleliemy Ahmed, Ciorba Florina M., Kasielke Franziska, Banicescu Ioana (2018), Experimental Verification and Analysis of Dynamic Loop Scheduling in Scientific Applications, in
The International Symposium on Parallel and Distributed Computing (ISPDC2018), IEEE Computer Society, Geneva, Switzerland.
Ciorba Florina M., Iwainsky Christian, Buder Patrick (2018), OpenMP Loop Scheduling Revisited: Making a Case for More Schedules, in
International Workshop on OpenMP (IWOMP2018), Springer, Barcelona, Spain.
Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M. (2017), Efficient Generation of Parallel Spin-images Using Dynamic Loop Scheduling, in
The 8th International Workshop on Multicore and Multithreaded Architectures and Algorithms, held in conjunction with HPCC2017, IEEE Computer Society, Bangkok, Thailand.
Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M. (2017), Exploring the Relation Between Two Levels of Scheduling Using a Novel Simulation Approach, in
The IEEE 16th International Symposium on Parallel and Distributed Computing (ISPDC), IEEE Computer Society, Innsbruck, Austria.
Boulmier Anthony, Banicescu Ioana, Ciorba Florina M., Abdennadher Nabil (2017), An Autonomic Approach for the Selection of Robust Dynamic Loop Scheduling Techniques, in
International Symposium on Parallel and Distributed Computing (ISPDC2017), IEEE, Innsbruck, Austria.
Hoffeins Franziska, Ciorba Florina M., Banicescu Ioana (2017), Examining the Reproducibility of Using Dynamic Loop Scheduling Techniques in Scientific Applications, in
International Parallel and Distributed Processing Symposium Workshops (IPDPSW), IEEE, Innsbruck, Austria.
Eleliemy Ahmed, Ciorba Florina M., A Resourceful Coordination Approach for Multilevel Scheduling, in
International Conference on High Performance Computing & Simulation (HPCS2020), IEEE, Spain.
Müller Korndörfer Jonas Henrique, Eleliemy Ahmed, Mohammed Ali, Ciorba Florina M., LB4OMP: A Dynamic Load Balancing Library for Multithreaded Applications, in
Transactions on Parallel and Distributed Systems (TPDS2021).
Müller Korndörfer Jonas Henrique, Bielert Mario, Pilla Laércio L., Ciorba Florina M., Mapping Matters: Application Process Mapping on 3-D Processor Topologies, in
The International Conference on High Performance Computing & Simulation (HPCS2020), IEEE, Barcelona, Spain.
High performance computing systems are increasing in size (in terms of node and core count) and diversity (e.g., core types per node), leading to an increase in their available parallelism. Hardware parallelism can be found at several levels, from machine instructions to global compute sites, and each hardware level has a corresponding level of software parallelism, from scalar instructions to global job queues. Unfortunately, exploiting the available hardware parallelism even at a single level is notoriously challenging, in part because of the difficulty of exposing and expressing parallelism in computational applications. Exposing, expressing, and exploiting parallelism becomes even harder as the parallelism within each level grows and when more than one or two parallelism levels must be considered together. Scheduling and load balancing are vital to any successful effort to coordinate and manage parallelism in high performance computing.

This project investigates and develops multilevel scheduling (MLS), a multilevel approach for achieving scalable scheduling in large scale high performance computing systems across the multiple levels of parallelism, with a focus on software parallelism. By integrating multiple levels of parallelism, MLS differs from hierarchical scheduling, which is traditionally employed to achieve scalability within a single level of parallelism. MLS is based on extending and bridging the most successful scheduling models (batch, application, and thread scheduling) beyond one or two parallelism levels (scaling across) and beyond their current scale (scaling out).

The proposed MLS approach aims to leverage all available parallelism and to address hardware heterogeneity in large scale high performance computers such that execution times are reduced, performance targets are achieved, and acceptable efficiency is maintained. The methodology for reaching these aims combines theoretical research studies, simulation, and experiments.

The expected outcome is an answer to the following research question: Given massive parallelism, at multiple levels and of diverse forms and granularities, how can it be exposed, expressed, and exploited such that execution times are reduced, performance targets (e.g., robustness against perturbations) are achieved, and acceptable efficiency (e.g., a tradeoff between maximizing parallelism and minimizing cost) is maintained? The project leverages the most efficient existing scheduling solutions, extends them beyond one or two levels, and scales them out within single levels of parallelism. It addresses four tightly coupled problems: scalable scheduling; adaptive and dynamic scheduling; heterogeneous scheduling; and bridging schedulers designed for competitive execution (e.g., batch and operating system schedulers) with those designed for cooperative execution (e.g., application-level schedulers).

Overall, the project aims to make a fundamental advance toward simpler-to-use large scale high performance computing systems, with impact not only on the computer science community but also on all computational science domains.
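To make the application-level (thread) scheduling model concrete, the following minimal C++ sketch illustrates dynamic loop self-scheduling, the technique studied in several of the publications listed above: worker threads repeatedly claim shrinking chunks of loop iterations from a shared counter, in the spirit of guided self-scheduling. This is an illustrative sketch only; the function name run_self_scheduled and the chunk-size rule are assumptions made for exposition and are not taken from the project's LB4OMP library or from any of the cited papers.

// Minimal sketch of dynamic loop self-scheduling with guided-self-scheduling-style
// chunk sizes. Illustrative only; not code from the project or from LB4OMP.
#include <algorithm>
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Each thread repeatedly claims the next chunk of iterations. The chunk size is
// roughly (remaining iterations / thread count), so chunks shrink as the loop
// drains and the load imbalance at the end of the loop stays small.
void run_self_scheduled(int total_iters, int num_threads) {
    std::atomic<int> next{0};  // index of the first unclaimed iteration

    auto worker = [&](int tid) {
        while (true) {
            int start = next.load();
            int remaining = total_iters - start;
            if (remaining <= 0) break;
            int chunk = std::max(1, remaining / num_threads);
            int end = std::min(start + chunk, total_iters);
            // Claim [start, end) atomically; retry if another thread was faster.
            if (!next.compare_exchange_weak(start, end)) continue;
            for (int i = start; i < end; ++i) {
                // ... per-iteration work with variable cost goes here ...
            }
            std::printf("thread %d executed iterations [%d, %d)\n", tid, start, end);
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
}

int main() { run_self_scheduled(1000, 4); }

In practice such a loop would typically be expressed through an OpenMP schedule clause or a load balancing library rather than hand-written threads; the sketch only shows the chunk-claiming mechanism that application-level schedulers of this kind would need to coordinate with batch-level schedulers under the multilevel scheduling approach.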