Project

MelanoBase

English title MelanoBase
Applicant Rinaldi Fabio
Number 162758
Funding scheme Interdisciplinary projects
Research institution Institut für Computerlinguistik Universität Zürich
Institution of higher education University of Zurich - ZH
Main discipline Other languages and literature
Start/End 01.03.2016 - 29.02.2020
Approved amount 420'469.00

All Disciplines (3)

Discipline
Other languages and literature
Information Technology
Molecular Biology

Keywords (4)

molecular biology; semi-automated semantic annotation; information extraction; text mining

Lay Summary (translated from Italian)

Lead
The scientific literature is an enormous archive of knowledge that can and should be used to facilitate research. The main aim of the "MelanoBase" project is to develop automatic systems for extracting knowledge from the scientific literature. This information must then be organized and structured in a knowledge base that allows experts to retrieve the information they need quickly and accurately, with references to the sources it comes from. Scientific validation and dissemination of the results will be carried out in collaboration with domain experts.
Lay summary

In brief

The scientific literature is an enormous archive of knowledge that can and should be used to facilitate research. However, the number and complexity of publications make it difficult even for experts to maintain up-to-date and complete knowledge of their own field of interest.

In the biomedical field in particular, there are currently about 25 million scientific publications, and it is estimated that on average two articles are published per minute. Clearly, no human being can analyse this volume of knowledge, even within a narrow domain.

Subject and objective

The main aim of the "MelanoBase" project is to develop automatic systems for extracting knowledge from the scientific literature. This information must then be organized and structured in a knowledge base that allows experts to retrieve the information they need quickly and accurately, with references to the sources it comes from. Scientific validation and dissemination of the results will be carried out in collaboration with domain experts.

Socio-scientific context

To verify the quality of the project's results, we have chosen melanoma as an example of a cancer with a particular social impact. The project aims to identify in the scientific literature all the information relevant to identifying the causes and potential treatments of this particularly dangerous and insidious form of cancer.

Keywords

biomedical text mining, scientific literature, information extraction, knowledge representation, melanoma

 
Last update: 08.02.2016

Publications

Publication
Exploring the feature space of character-level embeddings
Lauriola Ivano, Campese Stefano, Lavelli Alberto, Rinaldi Fabio, Aiolli Fabio (2020), Exploring the feature space of character-level embeddings, in ESANN.
Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review
Sheikhalishahi Seyedmostafa, Miotto Riccardo, Dudley Joel T, Lavelli Alberto, Rinaldi Fabio, Osmani Venet (2019), Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review, in JMIR Medical Informatics, 7(2), e12239.
A new approach and gold standard toward author disambiguation in MEDLINE
Vishnyakova Dina, Rodriguez-Esteban Raul, Rinaldi Fabio (2019), A new approach and gold standard toward author disambiguation in MEDLINE, in Journal of the American Medical Informatics Association (JAMIA), 26(10), 1037-1045.
Approaching SMM4H with Merged Models and Multi-task Learning
Ellendorff Tilia, Furrer Lenz, Colic Nicola, Aepli Noëmi, Rinaldi Fabio (2019), Approaching SMM4H with Merged Models and Multi-task Learning, in Proceedings of the 4th Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task, 58-61, ACL, Firenze.
OGER++: hybrid multi-type entity recognition
Furrer Lenz, Jancso Anna, Colic Nicola, Rinaldi Fabio (2019), OGER++: hybrid multi-type entity recognition, in Journal of Cheminformatics, 11(1), 7.
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Rinaldi Fabio, Yepes Antonio Jimeno, Holderness Eben, Minard Anne-Lyse, Lavelli Alberto, Pustejovsky James (ed.) (2019), Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019), Association for Computational Linguistics, Hong Kong.
UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition
Furrer Lenz, Cornelius Joseph, Rinaldi Fabio (2019), UZH@CRAFT-ST: a Sequence-labeling Approach to Concept Recognition, in Proceedings of The 5th Workshop on BioNLP Open Shared Tasks, 185-195, Association for Computational Linguistics, Hong Kong.
Learning preferences for large scale multi-label problems
Lauriola Ivano, Polato Mirko, Lavelli Alberto, Rinaldi Fabio, Aiolli Fabio (2018), Learning preferences for large scale multi-label problems, in International Conference on Artificial Neural Networks, 546-555, ICANN, Rhodes, Greece.
Learning Representations for Biomedical Named Entity Recognition
Lauriola Ivano, Sella Riccardo, Aiolli Fabio, Lavelli Alberto, Rinaldi Fabio (2018), Learning Representations for Biomedical Named Entity Recognition, in 2nd workshop on Natural Language for Artificial Intelligence (NL4AI 2018), 83-94, AI*IA, Trento, Italy.
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis
Minard Anne-Lyse, Rinaldi Fabio, Lavelli Alberto (ed.) (2018), Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis, Association for Computational Linguistics, Brussels.
Entity recognition in the biomedical domain using a hybrid approach
Basaldella Marco, Furrer Lenz, Tasso Carlo, Rinaldi Fabio (2017), Entity recognition in the biomedical domain using a hybrid approach, in Journal of Biomedical Semantics, 8(1), 51.
Efficient and Accurate Entity Recognition for Biomedical Text
Rinaldi Fabio, Furrer Lenz, Basaldella Marco (2017), Efficient and Accurate Entity Recognition for Biomedical Text, in BioCreative VI Workshop, 195-197, BioCreative, Bethesda, MD, USA.
Improving biocuration of microRNAs in diseases: a case study in idiopathic pulmonary fibrosis
Balderas-Martínez Yalbi Itzel, Rinaldi Fabio, Contreras Gabriela, Solano-Lira Hilda, Sánchez-Pérez Mishael, Collado-Vides Julio, Selman Moisés, Pardo Annie (2017), Improving biocuration of microRNAs in diseases: a case study in idiopathic pulmonary fibrosis, in Database, 2017, 1-15.
OGER: OntoGene's Entity Recogniser in the BeCalm TIPS Task
Furrer Lenz, Rinaldi Fabio (2017), OGER: OntoGene's Entity Recogniser in the BeCalm TIPS Task, in BioCreative V.5 Challenge Evaluation Workshop, 175-182, BioCreative, Barcelona, Spain.
Strategies towards digital and semi-automated curation in RegulonDB
Rinaldi Fabio, Lithgow Oscar, Gama-Castro Socorro, Solano Hilda, López-Fuentes Alejandra, Muñiz Rascado Luis José, Ishida-Gutiérrez Cecilia, Méndez-Cruz Carlos-Francisco, Collado-Vides Julio (2017), Strategies towards digital and semi-automated curation in RegulonDB, in Database, 2017, 1-15.
Web Conversations About Complementary and Alternative Medicines and Cancer: Content and Sentiment Analysis
Mazzocut Mauro, Truccolo Ivana, Antonini Marialuisa, Rinaldi Fabio, Omero Paolo, Ferrarin Emanuela, De Paoli Paolo, Tasso Carlo (2016), Web Conversations About Complementary and Alternative Medicines and Cancer: Content and Sentiment Analysis, in Journal of Medical Internet Research, 18(6), e120.
Author Name Disambiguation in MEDLINE Based on Journal Descriptors and Semantic Types
Vishnyakova Dina, Rodriguez-Esteban Raul, Ozol Khan, Rinaldi Fabio (2016), Author Name Disambiguation in MEDLINE Based on Journal Descriptors and Semantic Types, in Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining, ACL, Osaka.
Using a Hybrid Approach for Entity Recognition in the Biomedical Domain
Basaldella Marco, Furrer Lenz, Colic Nicola, Ellendorff Tilia Renate, Tasso Carlo, Rinaldi Fabio (2016), Using a Hybrid Approach for Entity Recognition in the Biomedical Domain, in 7th International Symposium on Semantic Mining in Biomedicine, 11-19, CEUR, Aachen.

Collaboration

Group / person Country
Types of collaboration
Dr. Mocellin, Melanoma Molecular Map Project, University of Padova Italy (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Dr. Soengas, Melanoma group, CNIO Spain (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
- Exchange of personnel
RegulonDB (http://regulondb.ccg.unam.mx/), UNAM Mexico (North America)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
- Exchange of personnel
Dr. Rodriguez-Esteban, Data Science Group, Hoffmann-La Roche, Basel, Switzerland Switzerland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Industry/business/other use-inspired collaboration
NLM/NCBI, group Wilbur United States of America (North America)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure

Associated projects

Number Title Start Funding scheme
130558 Semi-automated semantic enrichment of biomedical literature 01.08.2010 Project funding (Div. I-III)
178209 Deep Learning in Multi-type Named Entity Recognition for Biomedical Event Extraction 01.04.2018 Doc.Mobility

Abstract

The main aim of the MelanoBase project is the large-scale automatic extraction of actionable information from the biomedical literature and its integration with existing structured knowledge (life science databases). The innovative outcome of this strategy is to provide users (basic and clinical researchers) with formats that can be more easily queried and automatically processed, with the purpose of increasing the efficiency of biology research. The specific use case of melanoma has been selected for the histopathological complexity of these lesions, and to address the unmet need of separating true drivers of this disease from a myriad of inconsequential (epi)genetic byproducts accumulated during melanoma genesis. The project will pursue a literature-wide and disease-centric approach, which sets it apart from comparable projects worldwide. Moreover, close collaborations with experts in the field will streamline validation efforts in clinically relevant specimens.

The importance of knowledge integration across the life sciences, including literature processing, is emphasized by the number of research programs that focus on this goal. For example, the DARPA-funded "Big Mechanism" initiative aims at building computer models of cancer mechanisms using information obtained from automated reading of research papers. The German initiative i:DSem (Integrative Datensemantik in der Systemmedizin) aims to promote the development of medicine through the integration of structured and unstructured data sources, including literature processing.

The MelanoBase project aims at integrating all available knowledge about melanoma, with particular emphasis on hard-to-diagnose lesions and on mechanisms of resistance to clinically approved treatments and compounds in experimental testing. The resulting knowledge resource will be tested in the context of several leading European cancer research centers and a large pharmaceutical company.

Melanoma provides a relevant testbed for experimentation with large-scale literature analysis, not only for the obvious societal relevance of the disease, but also because it represents a prototype of inherently complex and variable tumors. All oncogenes and tumor suppressor programmes known to date are directly or indirectly deregulated in melanoma. Thus, with over 80,000 mutations described and a plethora of post-transcriptional changes, the number of interconnections between genes and altered phenotypes or signalling cascades is so large that a thorough analysis would require a colossal amount of experimental research. High-throughput literature mining can provide useful clues to help prioritize candidate targets, on the basis of evidence from previous experiments.

The primary goal of MelanoBase is to enable integration of the unstructured knowledge available in the literature with the structured knowledge deposited in life sciences databases. Additional sources such as clinical trial reports, systematic reviews (Cochrane), and prescription drug information might also be mined in a second stage of the project. Our ultimate goal is to accelerate gene discovery and drug target validation in the area of melanoma.

The results of the MelanoBase project will be integrated within the Melanoma Molecular Map repository (http://www.mmmp.org/MMMP) and experimentally relevant information will be validated by well-known experts in the field.