Multi-word Adverbs – How well are they handled in Parsing and Machine Translation?
Type of publication
Volk Martin, Graën Johannes
SPARCLING: Large-scale Annotation and Alignment of Parallel Corpora for the Investigation of Linguistic Variation
Title of proceedings
The 3rd Workshop on Multi-word Units in Machine Translation and Translation Technology
Multi-word expressions are often considered problematic for parsing and other tasks in natural language processing. In this paper we investigate a specific type of multi-word expression: binomial adverbs. These adverbs follow the pattern adverb + conjunction + adverb. We identify and evaluate binomial adverbs in English, German and Swedish. We compute their degree of idiomaticity with an ordering test and with a mutual information score. We show that these idiomaticity measures point us to a number of fixed multi-word expressions which are often mis-tagged and mis-parsed. Interestingly, a second evaluation shows that state-of-the-art machine translation handles them well, with some exceptions.
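The abstract names two idiomaticity measures without giving their formulas. As a rough illustration only, the sketch below assumes the ordering test is the share of occurrences in which a pair appears in one fixed order, and that the mutual information score is standard pointwise mutual information over corpus frequencies. The function names and all counts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def ordering_ratio(pair_counts, a, b, conj="and"):
    """Share of occurrences of the pair that appear in the order a-conj-b.

    Values near 1.0 suggest a frozen order (assumed reading of the
    'ordering test'; the paper's exact definition may differ).
    """
    forward = pair_counts[(a, conj, b)]
    reverse = pair_counts[(b, conj, a)]
    total = forward + reverse
    return forward / total if total else 0.0


def pmi(pair_count, count_a, count_b, n_tokens):
    """Pointwise mutual information (log base 2) of the two adverbs.

    Assumes simple relative frequencies as probability estimates.
    """
    if pair_count == 0 or count_a == 0 or count_b == 0:
        return float("-inf")
    p_pair = pair_count / n_tokens
    p_a = count_a / n_tokens
    p_b = count_b / n_tokens
    return math.log2(p_pair / (p_a * p_b))


# Invented toy counts, for illustration only (not corpus results).
toy_counts = Counter({
    ("first", "and", "foremost"): 98,
    ("foremost", "and", "first"): 2,
})

ratio = ordering_ratio(toy_counts, "first", "foremost")
score = pmi(pair_count=98, count_a=1000, count_b=100, n_tokens=1_000_000)
print(f"ordering ratio: {ratio:.2f}, PMI: {score:.2f}")
```

Under these assumptions, a pair with both a high ordering ratio and a high PMI would be a candidate fixed expression of the kind the abstract says is often mis-tagged and mis-parsed.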