Project

DM3 : Distributed MultiModal Media server, a low cost large capacity high throughput data storage system

English title: DM3 : Distributed MultiModal Media server, a low cost large capacity high throughput data storage system
Applicant: Bourlard Hervé
Number: 128771
Funding scheme: R'EQUIP
Research institution: IDIAP Institut de Recherche
Institution of higher education: Idiap Research Institute - IDIAP
Main discipline: Information Technology
Start/End: 01.03.2010 - 28.02.2011
Approved amount: 214'000.00

Keywords (7)

high bandwidth; distributed storage; high capacity storage; multimodal data; storage; high performance; Lustre

Lay Summary (English)

Lay summary
The Idiap Research Institute has been active for almost two decades in Swiss, European and worldwide projects and consortia in which the acquisition, storage, distribution, processing, and evaluation of very large amounts of digital data have played a central role in the execution and success of the research. To name but a few, Idiap has been and still is leading or involved in:
* the NCCR IM2 on Interactive Multimodal Information Management;
* the AMI and AMIDA EU FP6 projects;
* the MOBIO FP7 project;
* the DARPA/GALE US project.

Efficient access to the very large data sets involved is of paramount importance for Idiap to maintain its leadership in those projects.

The Distributed MultiModal Media Server (R'Equip) project thus aims at adding a high-capacity, high-bandwidth storage system to the Idiap research infrastructure using Sun Microsystems' Lustre technology (see http://lustre.org).

Lustre is an open source file system whose architecture scales to petabytes of capacity and gigabytes per second of I/O bandwidth, thanks to parallel access to distributed storage nodes. It has a successful history in High Performance Computing and among the Top500 Supercomputing Sites in the world (see http://top500.org).

The planned implementation will allow Idiap to extend its storage capacity by several hundred terabytes, with an aggregate bandwidth of several gigabytes per second, and will offer a leading-edge storage system for its research.

The Distributed MultiModal Media Server (R'Equip) project is funded by the Swiss National Science Foundation (see http://www.snf.ch).
Last update: 21.02.2013

Associated projects

Number | Title | Start | Funding scheme
111401 | NCCR IM2: Interactive Multimodal Information Management (phase II) | 01.01.2006 | National Centres of Competence in Research (NCCRs)
122062 | MULTI: Multimodal Interaction and Multimedia Data Mining | 01.10.2008 | Project funding (Div. I-III)

Abstract

The Idiap Research Institute has been active for almost two decades in Swiss, European and worldwide projects and consortia in which the acquisition, storage, distribution, processing, and evaluation of very large amounts of digital data have played a central role in the execution and success of the research. To name but a few, Idiap has been and still is leading or involved in:
• the NCCR IM2 on Interactive Multimodal Information Management;
• the AMI and AMIDA EU FP6 projects;
• the MOBIO FP7 project;
• the DARPA/GALE US project.

The need for efficient access to very large data sets in each of these projects will be discussed as examples in the proposal, and a number of other projects will also benefit from the planned system.

As leading house of the IM2 NCCR, Idiap has played a key role in coordinating the setup of acquisition infrastructure such as instrumented smart meeting rooms, and the time synchronization and alignment of different sources with different bandwidths and resolutions. Going from raw data to annotated corpora useful for research also requires infrastructure and services that go beyond adding a number of hard disks to a file server. Finally, distributing those datasets to other teams within the NCCR and worldwide, so that they could work on the same data, represented a breakthrough, and Idiap wants to maintain its leadership.

The current proposal aims at adding a high-bandwidth, high-capacity, distributed storage system to the Idiap research infrastructure. In addition to our “traditional” file server (based on NetApp systems), we currently have a total of 50TB of low-cost storage space available for static data. Because this storage is built on individual RAID systems, throughput is limited by the network bandwidth of each system (generally 1 Gbit/s). Furthermore, the increasing number of people using the system and the growing size of the datasets call for a significant increase in capacity and bandwidth.

With the goal of maintaining Idiap’s position as the Swiss leader in the acquisition, storage and distribution of massive amounts of multimedia data (coming from Idiap, as well as other national and international sources), we seek to improve our storage capacity by adding high-performance storage. The key idea of the project is to improve performance through the distribution of storage nodes, the striping of data, and direct access from computing machines, which requires a file system able to handle concurrent access. We intend to use Sun Microsystems' Lustre technology, which has a successful history as the file system used in many of the Top500 Supercomputing Sites - see http://top500.org.

To implement this, we plan to build racks of storage using standard multi-disk RAID systems (RAID6 systems and SATA disks). These systems will be connected to our network over high-bandwidth links (20 times faster than today), which will shorten the running time of experiments. We aim at increasing our storage capacity by adding four racks containing 120TB each.
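
The figures quoted in the abstract can be put together in a short back-of-the-envelope sketch. The Python below is illustrative only: it works out the added capacity (four racks of 120TB), the link speed implied by "20 times faster" than 1 Gbit/s, and an idealized transfer time, and it models striping as a plain round-robin placement of fixed-size chunks across storage nodes. The function names, the 1 TB example dataset, the 1 MiB stripe size and the node count of 4 are assumptions made for the example, not values from the proposal, and the code does not use or reproduce any Lustre API.

```python
# Back-of-the-envelope model of the planned system (illustrative only).

RACKS = 4                 # from the abstract: four racks
TB_PER_RACK = 120         # from the abstract: 120TB each
CURRENT_LINK_GBIT = 1.0   # from the abstract: generally 1 Gbit/s per system
SPEEDUP = 20.0            # from the abstract: 20 times faster than today


def added_capacity_tb(racks: int = RACKS, tb_per_rack: int = TB_PER_RACK) -> int:
    """Total capacity added by the new racks, in TB."""
    return racks * tb_per_rack


def target_link_gbit(current: float = CURRENT_LINK_GBIT,
                     speedup: float = SPEEDUP) -> float:
    """Link speed implied by the quoted speed-up, in Gbit/s."""
    return current * speedup


def transfer_seconds(size_tb: float, link_gbit: float) -> float:
    """Idealized time to move size_tb (decimal TB) over one link.

    Ignores protocol overhead, disk speed and contention.
    """
    bits = size_tb * 1e12 * 8
    return bits / (link_gbit * 1e9)


def stripe_layout(file_size: int, stripe_size: int, node_count: int) -> list[int]:
    """Bytes stored on each node when a file is cut into fixed-size chunks
    that are assigned to nodes in round-robin order (simplified model)."""
    per_node = [0] * node_count
    offset = 0
    chunk_index = 0
    while offset < file_size:
        chunk = min(stripe_size, file_size - offset)
        per_node[chunk_index % node_count] += chunk
        offset += chunk
        chunk_index += 1
    return per_node


if __name__ == "__main__":
    print("added capacity (TB):", added_capacity_tb())
    print("target link (Gbit/s):", target_link_gbit())
    # Hypothetical 1 TB dataset, before and after the upgrade.
    print("1 TB at 1 Gbit/s (s):", transfer_seconds(1, CURRENT_LINK_GBIT))
    print("1 TB at 20 Gbit/s (s):", transfer_seconds(1, target_link_gbit()))
    # Hypothetical 10 MiB file, 1 MiB stripes, 4 storage nodes.
    MIB = 2 ** 20
    print("bytes per node:", stripe_layout(10 * MIB, MIB, 4))
```

Because each node holds only a fraction of the file, several clients (or one client with several connections) can read different chunks at the same time, which is where the aggregate bandwidth beyond a single 1 Gbit/s link would come from in this simplified picture.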