Back to overview

TagMatch: Fast Partial Matching for Content Filtering and Routing

English title TagMatch: Fast Partial Matching for Content Filtering and Routing
Applicant Carzaniga Antonio
Number 157164
Funding scheme Project funding (Div. I-III)
Research institution Istituto di sistemi informatici (SYS) Facoltà di scienze informatiche
Institution of higher education Università della Svizzera italiana - USI
Main discipline Information Technology
Start/End 01.02.2015 - 31.01.2018
Approved amount 175'007.00
Show all

Keywords (4)

content-based forwarding; partial matching; information-centric networking; tag-based matching

Lay Summary (Italian)

Le reti "info-centriche" sono basate su un nuovo concetto di indirizzamento secondo il quale la rete fornisce accesso diretto all'informazione piuttosto che ai terminali (o "host"). L'indirizzamento dei terminali è tipico dei sistemi di rete classici in cui prevale la comunicazione tramite connessioni punto-punto tra persone o calcolatori. Indirizzando direttamente i dati si ottiene invece un servizio di comunicazione più adatto ad un interscambio di informazioni più variegato e ricco e tipico delle applicazioni moderne. Esempi di tali applicazioni vanno dal calcolo scientifico al commercio elettronico alle reti sociali alla domotica. Per esempio, un vantaggio è la possibilità di ottenere informazioni senza sapere dove queste risiedano e anche di ottenerle appena vengono prodotte, senza che la sorgente debba conoscere tutte le destinazioni interessate.
Lay summary
Un indirizzamento info-centrico è quello dei "tag".  Per esempio, un'applicazione si può registrare (per conto di un utente) per ricevere informazioni descritte dai tag {music,jazz} cosicché la rete inoltri a quell'applicazione, per esempio, un annuncio descritto dai tag {montreaux, jazz, festival, music, program}.

Ci proponiamo di studiare il problema dell'instradamento per le reti info-centriche.  In particolare il nostro obiettivo è di ottenere sistemi e algoritmi di selezione e instradamento rapido.  Il problema alla radice della selezione e dell'instradamento info-centrico è un problema molto generale di ricerca di sotto-insiemi, che è molto semplice da enunciare ma che sorprendentemente non sembra avere una soluzione altrettanto semplice.  Una soluzione efficiente e pratica a questo problema sarebbe dunque applicabile a molti altri sistemi di filtraggio e smistamento di informazioni.

In questo progetto prima di tutto riconsidereremo i risultati teorici e poi svilupperemo tecniche miste algoritmiche e sistemistiche, in particolare con l'uso di sistemi specializzati quali le unità di elaborazione grafiche ad alto parallelismo (GPU), le unità hardware programmabili (FPGA), e le memorie associative ternarie (TCAMS).
Direct link to Lay Summary Last update: 29.09.2014

Responsible applicant and co-applicants


Name Institute


Performance Annotations for Cloud Computing
RogoraDaniele, SmolkaSteffen, CarzanigaAntonio, DiwanAmer, SouléRobert (2017), Performance Annotations for Cloud Computing, in HotCloud'17: 9th USENIX Workshop on Hot Topics in Cloud Computing, Santa Clara, CA, USAUSENIX Association, Berkeley, CA, USA.
High-Throughput Subset Matching on Commodity GPU-Based Systems
Rogora Daniele, Papalini Michele, Khazaei Koorosh, Margara Alessandro, Carzaniga Antonio, Cugola Gianpaolo (2017), High-Throughput Subset Matching on Commodity GPU-Based Systems, in the Twelfth European Conference, Belgrade, SerbiaACM, New York, NY, USA.
High Throughput Forwarding for ICN with Descriptors and Locators
Papalini Michele, Khazaei Koorosh, Carzaniga Antonio, Rogora Daniele (2016), High Throughput Forwarding for ICN with Descriptors and Locators, in the 2016 Symposium, Santa Clara, California, USAACM, New York, NY, USA.


Group / person Country
Types of collaboration
Alexander L. Wolf, Imperial College London, UK Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure


This proposal is based on the notion of an information centric network in which files, streams, short messages, and even individual data packets are described and thereby addressed through sets of "tags". Routing and forwarding in such a network require an efficient solution of a problem known as partial matching. This problem is also essential to many other applications beyond information centric networking. We propose to revisit the problem from a theoretical perspective and then to study it more extensively from a systems perspective.An information centric network is intended to deliver information of interest to a user--either when the user requests it explicitly (request/reply) or as soon as information becomes available that matches the user's long-term interests (publish/subscribe)--but without requiring that the user specify the location of that information in the network. Thus an information centric network allows users to directly address information rather than network locations.One way to address information is by name, within a flat or hierarchically structured name space. This is the approach taken by many researchers in information centric networking. We propose a more expressive network service in which users describe information. In particular, we propose to use expressive descriptors consisting of sets of tags. Tags are already a very common and intuitive way of annotating and retrieving information in personal databases and on the Web. We propose to adopt them as a form of addresses. For example, if a user is interested in {quantum, physics} and {jazz, music} then the network should forward to that user information tagged as {montreaux, jazz, festival, music, program, 2014}.At the heart of a tag-based information centric network, or in any other information system that selects and disseminates information based on tags or features, is a matching engine that determines which information descriptors match which user requests or interests. We propose to study and develop highly scalable solutions for this matching problem.Our long-term objective is to be able to support tag-based information selection and routing at the scale of the current Internet. This means supporting hundreds of millions of users world-wide at the throughput of current high-end routers, which means hundreds of millions of matching operations per second. Notice, however, that IP forwarding amounts to prefix matching. By contrast, matching tag-based descriptors amounts to subset matching (or partial matching) and is considered a fundamentally more complex problem. Indeed, this complexity is what accounts for the greater expressiveness of tag-based descriptors.Our approach is to first aggregate the sets of tag sets that represent the routing information base, and then to use a staged and highly-parallel architecture for the matcher. Address aggregation is an essential component of IP networking and plays a similar role in tag-based information centric networking. The specific research questions regarding aggregation are whether tag-based descriptors would aggregate well enough to significantly reduce the amount of descriptors that must be processed by the matcher, and then how to effectively perform this aggregation, which again involves partial matching.Therefore, the central contribution of this project will be a highly scalable system for tag-based information centric networking. We propose to integrate general-purpose CPUs with highly-parallel modern GPUs and also FPGAs and TCAMs. The challenge will be to design a distributed matching algorithm across these heterogeneous computing units and then to implement it so as to (1) assign to each unit the most appropriate algorithmic function to best exploit its specific computational capabilities, and to (2) balance the workload induced by the global matching algorithm onto each unit, so as to optimize the overall throughput.This project is an integral part of a broader ongoing effort to realize the innovative idea of an Internet architecture based on the direct addressing of content by both consumers and producers.