Project


FORWARD (Fine-grained netwORk floW behAvior pReDiction)

English title: FORWARD (Fine-grained netwORk floW behAvior pReDiction)
Applicant: Eugster Patrick
Number: 192121
Funding scheme: Project funding (Div. I-III)
Research institution: Istituto di sistemi informatici (SYS), Facoltà di scienze informatiche
Institution of higher education: Università della Svizzera italiana - USI
Main discipline: Information Technology
Start/End: 01.04.2020 - 31.03.2024
Approved amount: 872'637.00

Keywords (6)

congestion control; monitoring; Kalman filtering; network flow prediction; traffic engineering; datacenter networks

Lay Summary (translated from German)

Lead
A general challenge for computer networks is that the exact load they are subjected to, and how that load is distributed, is not known in advance. This makes it difficult to act in time when local overload occurs (e.g., by rerouting certain data flows) so as to avoid losing transmitted data. The result is degraded performance, because the lost data must be retransmitted. Recent approaches try to react as quickly as possible by detecting bottlenecks in the network and slowing down the corresponding flows. Reactive approaches, however, always lag behind the actual state of the network, which can lead to inappropriate reactions that make the situation even worse.
Lay summary

Content and goal of the research project

The goal of this project is to enable networks to anticipate bottlenecks and to initiate appropriate measures before any data loss occurs. In detail, we will (1) develop specific machine learning techniques that are precisely tailored to the characteristics of network flows and that can predict how their bandwidth requirements evolve. We will further (2) develop a software infrastructure that makes it possible to capture, at runtime, the information about the state of network flows in datacenters that is relevant to our techniques, without affecting the transmission of those flows. Finally, we (3) propose various ways of acting on the predictions, as part of novel approaches to congestion control and traffic engineering.

Scientific and societal context of the research project

Computer networks form the backbone of today's information-driven society. The majority of web applications and services use computing resources in datacenters. It is therefore of great importance to improve the utilization of networks in datacenters. Such improvements lead not only to better application performance but also to a better ratio between energy consumption and performance.


Last update: 29.03.2020

Associated projects

Number Title Start Funding scheme
197353 BASIS (hyBrid Asynchronous/Synchronous dIstributed Systems) 01.03.2021 Project funding (Div. I-III)

Abstract

One main challenge for the design of networks is that traffic load is not generally known in advance. This makes it hard to allocate resources so as to best prevent or mitigate bottlenecks. Take today’s datacenters (DCs), which execute a variety of applications. DC networks experience congestion due to the bursty nature of certain network traffic, with multiple flows transmitted on the same link producing high peaks simultaneously. Increasing network bandwidth, nowadays up to 100 Gb/s, does not eradicate the problem, as many so-called elephant flows in low-bandwidth networks result from peaks which are “flattened”; in high-bandwidth networks these retain their original (bursty) nature. Flow completion times under congestion are unlikely to decrease with increased bandwidth when relying on reactive congestion control mechanisms (e.g., TCP, DCTCP, PCC), as these incur further communication between sender and receiver and thus delays in reacting to congestion. Recent protocols attempting to avoid congestion upfront for lossless communication (cf. DCQCN, TIMELY), as required for remote direct memory access (RDMA), are a step in the right direction - from reactive to proactive. Yet, proactive protocols which “live in the present” will always be lagging. Ideally, a network would be capable of predicting the evolution of the bandwidth requirements of traffic flows, so that an impending over-utilization would be recognized early enough to, e.g., throttle flows or re-assign them to other paths before congestion and packet loss occur. Previous research on such network flow (size evolution) prediction, however, only handles traffic data aggregated in time and space - considering flows over long periods of time, combining many such flows, and predicting for the same aggregated flows.
Since most congestion is caused by the interaction of short traffic bursts from elephant flows, such coarse-grained predictions cannot usefully predict bursts in individual traffic flows. This project, dubbed FORWARD (fine-grained network flow behavior prediction), is thus concerned with fine-grained per-flow prediction, considered so far to be unsolved/unsolvable. More precisely, we are interested in prediction [indiv] at the level of individual flows to enable adequate adaptations especially under congestion; [scale] for large numbers of flows simultaneously to scale to entire networks; [lead] with sufficient lead time to take corrective actions; [accur] with high accuracy; [impl] implementable on commodity switches without additional specialized hardware. Moreover, we are interested in concretely applying this fine-grained prediction to protocols for congestion control as well as traffic engineering. To do so while achieving the above-mentioned goals, we first propose to design and implement a novel infrastructure for network monitoring and management that can execute logic used for the collection of features relevant to prediction, run prediction algorithms, and take corresponding actions “as close as possible” to the switching hardware at the necessary rate, in a globally optimized manner. Based on this distributed infrastructure, we then investigate the application of our prediction to congestion control and traffic engineering, by extension of existing protocols and conception of novel ones. In summary, this project concretely strives to address the following hard research questions: (1) How to predict behavior of individual network flows with fine granularity? (2) How to monitor and manage network flows with low latency and high accuracy? (3) How to improve congestion control and traffic engineering with network flow prediction?
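
Kalman filtering appears among the project keywords, but the abstract does not say which prediction model is used. The following is therefore only a minimal sketch, under stated assumptions, of what per-flow prediction with a lead time could look like: a two-state Kalman filter (rate and slope) fed with the measured rate of a single flow once per sampling interval. The class name, the noise parameters, the synthetic ramp, and the link capacity are all hypothetical and are not taken from the project.

```python
class FlowRateKalman:
    """Kalman filter with a local linear trend model for ONE flow.

    State: [rate, slope]; measurement: observed rate per sampling interval.
    All names and default noise values are illustrative assumptions.
    """

    def __init__(self, meas_var=1.0, rate_var=0.01, slope_var=0.01, init_var=1000.0):
        self.rate = 0.0
        self.slope = 0.0
        # 2x2 state covariance, stored entry by entry (large = "know nothing yet").
        self.p00, self.p01, self.p10, self.p11 = init_var, 0.0, 0.0, init_var
        self.meas_var = meas_var    # measurement noise variance
        self.rate_var = rate_var    # process noise on the rate
        self.slope_var = slope_var  # process noise on the slope

    def update(self, measured_rate):
        # Predict one interval ahead: the rate advances by the slope.
        rate_pred = self.rate + self.slope
        slope_pred = self.slope
        q00 = self.p00 + self.p01 + self.p10 + self.p11 + self.rate_var
        q01 = self.p01 + self.p11
        q10 = self.p10 + self.p11
        q11 = self.p11 + self.slope_var
        # Correct with the new measurement (only the rate is observed).
        residual = measured_rate - rate_pred
        innovation_var = q00 + self.meas_var
        k0, k1 = q00 / innovation_var, q10 / innovation_var
        self.rate = rate_pred + k0 * residual
        self.slope = slope_pred + k1 * residual
        self.p00 = (1.0 - k0) * q00
        self.p01 = (1.0 - k0) * q01
        self.p10 = q10 - k1 * q00
        self.p11 = q11 - k1 * q01

    def forecast(self, steps_ahead):
        # Extrapolate the current estimate along the estimated slope.
        return self.rate + steps_ahead * self.slope


# Usage on a synthetic, noiseless ramp (hypothetical rates in Mb/s).
kf = FlowRateKalman()
for t in range(50):
    kf.update(10.0 + 2.0 * t)

LINK_CAPACITY = 120.0  # hypothetical link capacity in Mb/s
LEAD_STEPS = 5         # hypothetical lead time, in sampling intervals
predicted = kf.forecast(LEAD_STEPS)
print(f"forecast {LEAD_STEPS} intervals ahead: {predicted:.2f} Mb/s")
if predicted > LINK_CAPACITY:
    print("predicted over-utilization: throttle or re-route this flow")
```

One filter instance per flow keeps the state to a handful of floating-point numbers, which is the kind of footprint the [scale] and [impl] goals would call for; whether such a model reaches the accuracy and lead time the project targets on real, bursty traffic is exactly the open question the abstract raises.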