Publication
OGER++: hybrid multi-type entity recognition
Type of publication
Peer-reviewed
Publication form
Original article (peer-reviewed)
Author
Furrer Lenz, Jancso Anna, Colic Nicola, Rinaldi Fabio
Project
MelanoBase
Journal
Journal of Cheminformatics
Publisher
BioMed Central
Volume (Issue)
11(1)
Page(s)
7 - 7
Title of proceedings
Journal of Cheminformatics
DOI
10.1186/s13321-018-0326-3
Open Access
URL
https://jcheminf.biomedcentral.com/articles/10.1186/s13321-018-0326-3
Type of Open Access
Publisher (Gold Open Access)
Abstract
Background: We present a text-mining tool for recognizing biomedical entities in scientific literature. OGER++ is a hybrid system for named entity recognition and concept recognition (linking), which combines a dictionary-based annotator with a corpus-based disambiguation component. The annotator uses an efficient look-up strategy combined with a normalization method for matching spelling variants. The disambiguation classifier is implemented as a feed-forward neural network which acts as a postfilter to the previous step. Results: We evaluated the system in terms of processing speed and annotation quality. In the speed benchmarks, the OGER++ web service processes 9.7 abstracts or 0.9 full-text documents per second. On the CRAFT corpus, we achieved 71.4% and 56.7% F1 for named entity recognition and concept recognition, respectively. Conclusions: Combining knowledge-based and data-driven components allows creating a system with competitive performance in biomedical text mining.
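The two-stage design described in the abstract (dictionary look-up with normalization, followed by a postfilter) can be illustrated with a minimal sketch. This is not the OGER++ implementation: the normalization rule (lowercasing and dropping non-alphanumeric characters), the greedy longest-match strategy, the `EX:0001` concept identifier, and all function names are assumptions made for illustration, and the `score` callable merely stands in for the feed-forward network mentioned above.

```python
import re
from typing import Callable, Dict, List, NamedTuple


class Annotation(NamedTuple):
    start: int
    end: int
    text: str
    concept_id: str


# Assumed tokenization: maximal runs of ASCII letters and digits.
TOKEN = re.compile(r"[A-Za-z0-9]+")


def normalize(term: str) -> str:
    """Assumed normalization: keep alphanumerics only, lowercased,
    so that 'IL-2', 'IL 2' and 'il2' all map to the key 'il2'."""
    return "".join(TOKEN.findall(term)).lower()


def build_index(terms: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each normalized term to the concept IDs it may denote."""
    index: Dict[str, List[str]] = {}
    for name, concept_id in terms.items():
        ids = index.setdefault(normalize(name), [])
        if concept_id not in ids:
            ids.append(concept_id)
    return index


def annotate(text: str, index: Dict[str, List[str]],
             max_tokens: int = 4) -> List[Annotation]:
    """Greedy longest-match look-up over windows of up to max_tokens tokens."""
    tokens = list(TOKEN.finditer(text))
    found: List[Annotation] = []
    i = 0
    while i < len(tokens):
        step = 1
        for n in range(min(max_tokens, len(tokens) - i), 0, -1):
            start, end = tokens[i].start(), tokens[i + n - 1].end()
            key = normalize(text[start:end])
            if key in index:
                found.extend(Annotation(start, end, text[start:end], cid)
                             for cid in index[key])
                step = n  # skip past the matched span
                break
        i += step
    return found


def postfilter(annotations: List[Annotation],
               score: Callable[[Annotation], float],
               threshold: float = 0.5) -> List[Annotation]:
    """Keep candidates whose score reaches the threshold; `score` is a
    placeholder for a trained disambiguation classifier."""
    return [a for a in annotations if score(a) >= threshold]


# Hypothetical dictionary with a made-up concept ID.
index = build_index({"interleukin 2": "EX:0001", "IL-2": "EX:0001"})
hits = annotate("Interleukin-2 and IL2 levels rose.", index)
```

In this sketch the dictionary stage favors recall by matching any spelling variant that normalizes to a known key, and the postfilter is where a data-driven model would reject spurious candidates.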