Project

Virosaurus: a virus references database for High-Throughput Sequencing in clinical virology

English title Virosaurus: a virus references database for High-Throughput Sequencing in clinical virology
Applicant Le Mercier Philippe
Number 189179
Funding scheme Project funding
Research institution Institut Suisse de Bioinformatique Centre Médical Universitaire Université de Genève
Institution of higher education Swiss Institute of Bioinformatics - SIB
Main discipline Medical Microbiology
Start/End 01.12.2019 - 30.11.2021
Approved amount 109'520.00

All Disciplines (2)

Discipline
Medical Microbiology
Infectious Diseases

Keywords (6)

HTS diagnostics; Virus diagnostics; Clinical virology; Reference genome; complete genomes; virus discovery

Lay Summary (translated from French)

Lead
Virosaurus: a database supporting the diagnosis of viral infections by sequencing.
Lay summary
Diagnostic tools for viral infections require specific assays for each type of virus. We can therefore only detect what we are looking for. Moreover, the biological diversity of certain viruses makes them relatively opaque to these detection methods, for example the virus responsible for Lassa hemorrhagic fever.

High-throughput sequencing (HTS) offers new opportunities, since a single analysis can identify all viruses present in a sample. The robustness of this new method depends on the reference database, which must allow all circulating variants to be detected. Virosaurus (for "virus thesaurus") is a database created for this purpose. Starting from all known viral sequences, algorithms allow us to identify the necessary and sufficient sequences to cover all viral variants of vertebrate viruses. All these sequences are annotated to facilitate diagnostic work: species names, nature of the genome, different subtypes, and so on.
We thus aim to create a freely accessible database that can facilitate diagnosis by sequencing, provide a reference standard for all hospitals, and be updated every year. In addition, this work will provide, for the first time, a database of complete viral genomes useful for basic research.
All these data will be freely accessible on the academic website ViralZone.



Last update: 25.11.2019

Publications

Publication
Gleizes Anne, Laubscher Florian, Guex Nicolas, Iseli Christian, Junier Thomas, Cordey Samuel, Fellay Jacques, Xenarios Ioannis, Kaiser Laurent, Le Mercier Philippe (2020), Virosaurus A Reference to Explore and Capture Virus Genetic Diversity, in Viruses, 12(11), 1248.

Datasets

Complete virus sequences, Virosaurus90, Virosaurus98

Author Gleizes, Anne; Le Mercier, Philippe; de Castro, Edouard
Persistent Identifier (PID) https://viralzone.expasy.org/8676
Repository ViralZone


Collaboration

Group / person: Clinical Microbiological Laboratory, department of Medical Microbiology, Leiden University Medical C
Country: Netherlands (Europe)
Types of collaboration:
- in-depth/constructive exchanges on approaches, methods or results
- Publication

Scientific events

Active participation

Title: viruses in silico | the EVBC lecture series
Type of contribution: Individual talk
Title of article or contribution: Viral reference database as a critical factor for clinical metagenomics: a review using Virosaurus as an example.
Date: 14.06.2021
Place: Virtual conference, Germany
Persons involved: Le Mercier Philippe


Abstract

Identifying viral pathogens in clinical samples relies on immunodetection or polymerase chain reaction (PCR) methods. These tests are individual and specific to each virus: we detect only what we are looking for. This makes it challenging to diagnose highly variable viruses such as Lassa and difficult to identify unknown viruses. High-Throughput Sequencing (HTS) technology analyzes a sample for all genetic material without any preconception. Because it reveals all genetic entities present in a sample, this method offers promising perspectives for virus detection and diagnosis (Chiu and Miller, 2019). One of the key elements of HTS diagnosis is the genomic reference database used to identify a virus among the millions of reads produced by these technologies. We decided to design a robust database for virus diagnostics that could be made available to all public institutions.
Recently, the SIB Swiss Institute of Bioinformatics (SwissProt and Vital-IT), Hôpitaux Universitaires de Genève and Centre Hospitalier Universitaire Vaudois collaborated to develop a vertebrate virus database for clinical detection. The result was satisfactory and convinced us to maintain and upgrade this resource. We propose to extend the database to non-vertebrate eukaryotic viruses (insect, plant and fungal viruses). This means first updating our database and processing all the new data. We plan to annotate the sequences to facilitate clinical interpretation of results, for example by adding virus subtype data that matters for treatment, such as the distinction between polio and non-polio enteroviruses. We plan to test the new database using clinical samples and sequencing simulations, and to benchmark its performance against RVDB, a virus database released by the US FDA in March 2018. We hope to provide a comprehensive, freely accessible academic database that may help standardize the use of HTS in Switzerland.
Moreover, this work will identify all complete eukaryotic virus genomes, a dataset that is missing from many bioinformatics resources. All data will be made available on a dedicated ViralZone web page.
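
The lay summary describes reducing all known viral sequences to a necessary and sufficient set, and the datasets listed above are named Virosaurus90 and Virosaurus98. The following is a minimal sketch of one way such a reduction could work, not the project's actual pipeline. It assumes (this page does not say so) that the suffixes denote similarity thresholds, and it uses greedy clustering with k-mer Jaccard similarity as a crude stand-in for alignment-based sequence identity. All function names and parameters are hypothetical.

```python
def kmers(seq, k=4):
    """Return the set of all k-mers (substrings of length k) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def similarity(a, b, k=4):
    """Jaccard similarity of two k-mer sets.

    Assumption: this is only a rough proxy for the percent identity
    that a real pipeline would compute from alignments.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def pick_representatives(sequences, threshold=0.9, k=4):
    """Greedily select representative sequences.

    Sequences are visited longest first. A sequence becomes a new
    representative only if it is below the threshold against every
    representative already kept; otherwise it is considered covered.
    Returns the names of the representatives in selection order.
    """
    representatives = []
    ordered = sorted(sequences.items(), key=lambda item: -len(item[1]))
    for name, seq in ordered:
        covered = any(
            similarity(seq, rep_seq, k) >= threshold
            for _, rep_seq in representatives
        )
        if not covered:
            representatives.append((name, seq))
    return [name for name, _ in representatives]
```

Calling `pick_representatives` with a dictionary mapping sequence names to nucleotide strings returns the names retained as references. Raising the threshold (for example from 0.9 to 0.98) keeps more sequences and therefore more of the variant diversity, at the cost of a larger database.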