Project

Bio-SODA: Enabling Complex, Semantic Queries to Bioinformatics Databases through Intuitive Searching over Data

English title Bio-SODA: Enabling Complex, Semantic Queries to Bioinformatics Databases through Intuitive Searching over Data
Applicant Stockinger Kurt
Number 167149
Funding scheme NRP 75 Big Data
Research institution ZHAW Zürcher Hochschule für Angewandte Wissenschaften
Institution of higher education Zurich University of Applied Sciences - ZHAW
Main discipline Information Technology
Start/End 01.04.2017 - 31.03.2021
Approved amount 637'913.00

Keywords (4)

Query processing; Bioinformatics; Semantic search; Data integration

Lay Summary (translated from German)

Lead
Complex bioinformatics databases hold enormous knowledge that can only be queried with technical know-how. The goal of the project is to develop an intuitive, Google-like search function that should help to identify new relationships in the stored data.
Lay summary

The task of this project is comparable to translating from one language into another. Using this analogy, the languages used to query bioinformatics data can be compared with Esperanto and Latin. If these languages are mastered poorly or not at all, only limited bioscientific insights can be gained, because communication with the system is sluggish. BioSODA (Search Over DAta Warehouse for Biology) is intended to convert intuitive search terms into complex database queries.

Rapid advances in DNA sequencing are making the life sciences a very data-intensive discipline. Vast amounts of bioinformatics data are stored in complex databases that are built on powerful technologies but require a great deal of background knowledge in computer science to query. New search technologies are needed for the efficient analysis of dozens of bioinformatics databases.

This project develops new Google-like search capabilities so that researchers can query the databases intuitively and concentrate on scientific questions.

BioSODA makes it easier to query the huge amounts of bioinformatics data. The program also makes search suggestions in order to display information that was not explicitly searched for. We expect easier access to knowledge and thus faster insight into biological relationships that may still be unknown.

Last update: 26.07.2017

Lay Summary (translated from French)

Lead
Complex bioinformatics databases contain a great deal of knowledge, but only technical expertise gives access to it. The project aims to develop an intuitive search function comparable to Google. It should help to recognise new phenomena in the stored data.
Lay summary

If the languages used to query bioinformatics data are mastered poorly or not at all, the difficulty of communicating with the system gives access to only a limited amount of life-science knowledge. BioSODA (Search Over DAta Warehouse for Biology) is meant to transform intuitive keywords into complex search queries.

Owing to the rapid progress in DNA sequencing, the life sciences generate a very large volume of bioinformatics information stored in complex databases. Although these databases are based on robust technologies, querying them requires in-depth knowledge of computer science. New approaches are needed for more efficient analysis.

This project develops search tools comparable to the Google search engine, so that researchers can query the databases intuitively and concentrate on scientific questions.

BioSODA makes it easier to query the enormous quantities of bioinformatics data. The program also formulates search suggestions in order to highlight information that is not expressly sought. We hope thereby to obtain easier access to knowledge and to develop knowledge about still-unknown biological phenomena more quickly.


Last update: 26.07.2017

Lay Summary (English)

Lead
Complex bioinformatics databases hold enormous amounts of knowledge that can only be retrieved with technical know-how. The goal of this project is to develop an intuitive, Google-like search function designed to help identify new correlations in the stored data.
Lay summary
The project remit is comparable to translating one language into another. Continuing with this analogy, the languages used for retrieving bioinformatics data can be compared with Esperanto and Latin. With no or merely a poor command of these languages, only limited bioscientific findings can be obtained since communication with the system is laborious. The objective behind BioSODA (Search Over DAta Warehouse for Biology) is to convert intuitive search terms into complex search queries.

Rapid advances in DNA sequencing are transforming biosciences into a highly data-intensive discipline. Vast quantities of bioinformatics data are stored in complex databases which are built on powerful technologies, but also demand a great deal of background information technology expertise when it comes to retrieval. New search technologies are needed to efficiently analyse dozens of bioinformatics databases.

The present project is developing novel Google-like search options that allow researchers to query databases intuitively and concentrate on scientific questions.

BioSODA makes running queries in huge quantities of bioinformatics data an easier task. The program also makes search suggestions in order to display information that has not been expressly sought. We aim to achieve easier access to knowledge and thus gain faster insights into perhaps still unknown biological correlations.

Last update: 26.07.2017

Publications

Publication
On the Move to Meaningful Internet Systems: OTM 2019 Conferences. Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, October 21–25, 2019, Proceedings
Mendes de Farias Tarcisio, Stockinger Kurt, Dessimoz Christophe (2019), On the Move to Meaningful Internet Systems: OTM 2019 Conferences. Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, October 21–25, 2019, Proceedings, in On the Move to Meaningful Internet Systems: OTM 2019 Conferences, Rhodes, Greece, Springer International Publishing, Cham.
A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL
Sima Ana Claudia, Dessimoz Christophe, Stockinger Kurt, Zahn-Zabal Monique, Mendes de Farias Tarcisio (2019), A hands-on introduction to querying evolutionary relationships across multiple data sources using SPARQL, in F1000Research, 8, 1822-1822.
Evolutionary Genomics: Statistical and Computational Methods
Sima Ana Claudia, Stockinger Kurt, de Farias Tarcisio Mendes, Gil Manuel (2019), Evolutionary Genomics: Statistical and Computational Methods, Springer New York, New York, NY.
Enabling semantic queries across federated bioinformatics databases
Sima Ana Claudia, Mendes de Farias Tarcisio, Zbinden Erich, Anisimova Maria, Gil Manuel, Stockinger Heinz, Stockinger Kurt, Robinson-Rechavi Marc, Dessimoz Christophe (2019), Enabling semantic queries across federated bioinformatics databases, in Database, 2019, n/a-n/a.
Leveraging logical rules for efficacious representation of large orthology datasets
T. M. de Farias et al., Leveraging logical rules for efficacious representation of large orthology datasets, in International Semantic Web Applications and Tools for Healthcare and Life Sciences (SWAT4HCLS), Antwerp, Belgium.

Collaboration

Group / person Country
Types of collaboration
Quest for Orthologs consortium Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Uberon collaboration United States of America (North America)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
ELIXIR Europe (www.elixir-europe.org) Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
Bayer CropScience Belgium (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Exchange of personnel
- Industry/business/other use-inspired collaboration

Scientific events



Self-organised

Title | Date | Place
Usability Workshop | 31.01.2019 | Lausanne, Switzerland

Knowledge transfer events

Active participation

Title | Type of contribution | Date | Place | Persons involved
Workshop with 3 SFN Big Data projects with biologic application | Workshop | 18.12.2017 | Bern, Switzerland | Dessimoz Christophe; Stockinger Heinz; Anisimova Maria; Stockinger Kurt; Sima Ana Claudia; Mendes de Farias Tarcisio; Robinson-Rechavi Marc; Zbinden Erich


Communication with the public

Communication | Title | Media | Place | Year
Media relations: print media, online media | Video about the Bio-SODA Project | ZHAW, School of Engineering | German-speaking Switzerland | 2017

Associated projects

Number | Title | Start | Funding scheme
186397 | An integrated evolutionary and functional characterisation of the Drosophila immune peptidic secretome | 01.07.2019 | Sinergia
183723 | Embracing Phylogenetic Incongruence Among Genetic Loci | 01.09.2019 | SNSF Professorships

Abstract

One of the major promises of Big Data lies in the simultaneous mining of multiple sources of data. This is particularly important in the life sciences, where different and complementary data are scattered across multiple resources. To overcome this issue, the use of RDF/semantic web technology is emerging, but querying these systems often proves too complex for most users, which hampers wide development and adoption of these technologies. This project aims to enable sophisticated semantic queries across large, decentralized and heterogeneous databases via an intuitive interface. The system will enable scientists, without prior training, to perform powerful joint queries across resources in ways that cannot be anticipated, and it therefore goes far beyond the query functionality of specialized knowledge bases.

The project represents an interdisciplinary collaboration between information systems and bioinformatics. It builds directly upon the team’s prior experience in integrating and querying databases at a major Swiss bank, in developing world-leading bioinformatics databases, in combining biological ontologies for data analysis, and in maintaining the highly accessed bioinformatics resource portal ExPASy.
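
To make the idea of turning intuitive search terms into a structured semantic query more concrete, the following is a minimal Python sketch. It is not the Bio-SODA method, which this record does not describe. The lookup table, the `example.org` predicate URIs and the function name `keywords_to_sparql` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical keyword-to-predicate lookup; the URIs are placeholders,
# not terms from any real ontology or from the Bio-SODA system.
KEYWORD_TO_PREDICATE: Dict[str, str] = {
    "gene": "http://example.org/vocab/geneName",
    "species": "http://example.org/vocab/speciesName",
    "disease": "http://example.org/vocab/associatedDisease",
}


def keywords_to_sparql(keywords: Dict[str, str], limit: int = 10) -> str:
    """Build a SPARQL SELECT query with one triple pattern per known keyword."""
    patterns: List[str] = []
    for field, value in keywords.items():
        predicate = KEYWORD_TO_PREDICATE.get(field.lower())
        if predicate is None:
            raise KeyError(f"no mapping for keyword field: {field!r}")
        # Escape backslashes and double quotes so the literal stays valid.
        literal = value.replace("\\", "\\\\").replace('"', '\\"')
        patterns.append(f'  ?entity <{predicate}> "{literal}" .')
    body = "\n".join(patterns)
    return f"SELECT DISTINCT ?entity WHERE {{\n{body}\n}}\nLIMIT {limit}"


# Two plain search terms become a query with two joined triple patterns.
query = keywords_to_sparql({"gene": "INS", "species": "Homo sapiens"})
print(query)
```

A real system would have to resolve free-text keywords against the vocabularies of several databases and rank candidate interpretations. The fixed dictionary here stands in for that step only to show the shape of the translation.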