computational linguistics; text-to-text generation; synchronous grammars; machine translation; human language technology; linear programming
Violeta Seretan, Eric Wehrli (2013), Syntactic Concordancing and Multi-Word Expression Detection, in International Journal of Data Mining, Modelling and Management, 5(2), 158-181.
Violeta Seretan (2012), Acquisition of Syntactic Simplification Rules for French, in Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Istanbul, Turkey.
Violeta Seretan (2011), A Collocation-Driven Approach to Text Summarization, in Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles, ATALA, Montpellier, France.
Violeta Seretan, Eric Wehrli (2011), FipsCoView: On-line Visualisation of Collocations Extracted from Multilingual Parallel Corpora, in Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World, ACL, Portland, Oregon, USA.
Violeta Seretan, Acquisition of Syntactic Text Simplification Rules for French
Violeta Seretan, Text-to-text Generation Methods
This research project in Computational Linguistics focuses on an emerging research area, text-to-text generation, which departs from traditional 'concept-to-text' approaches to text generation by capitalising on the available large-scale language resources (text corpora, lexical thesauri, parsers) in order to overcome the limited availability of rich conceptual information. In this context, the project aims to study the applicability of Integer Linear Programming, an optimisation technique that is increasingly used in natural language processing, to transforming an input text so that it obeys a set of specific constraints, which depend on the desired front-end application. For instance, in machine translation the (often imperfect) output text may be changed so that it becomes more grammatical or more fluent. Novel machine learning techniques combined with synchronous parsing will be used to learn transformation rules automatically in a data-driven fashion, rather than defining these rules manually. The project will be supported by a renowned expert in the field, Dr Mirella Lapata of the University of Edinburgh, and will give me the opportunity to train in statistical aspects, which are currently underrepresented in my work context at the University of Geneva but are indispensable for a competitive computational linguistics research profile.
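To make the idea of constraint-based text transformation concrete, the toy sketch below casts word deletion (a simple form of sentence compression) as a 0-1 Integer Linear Programme: maximise the total score of the kept words, subject to a length budget and to the constraint that a modifier is kept only if its head is kept. Everything in it is an illustrative assumption rather than the project's method: the function name `compress`, the word scores, and the head-modifier pairs are invented, and exhaustive enumeration stands in for a real ILP solver, which would be needed for anything beyond a handful of words.

```python
from itertools import product


def compress(words, scores, max_len, dependencies):
    """Solve a tiny 0-1 ILP by exhaustive enumeration (toy stand-in for a solver).

    maximise    sum_i scores[i] * x[i]
    subject to  sum_i x[i] <= max_len      (length budget)
                x[child] <= x[head]        (keep a modifier only with its head)
                x[i] in {0, 1}             (1 = keep word i, 0 = delete it)
    """
    best_value, best_x = float("-inf"), None
    for x in product((0, 1), repeat=len(words)):
        if sum(x) > max_len:
            continue  # violates the length budget
        if any(x[child] > x[head] for child, head in dependencies):
            continue  # a modifier would be kept without its head
        value = sum(s * xi for s, xi in zip(scores, x))
        if value > best_value:
            best_value, best_x = value, x
    return [w for w, xi in zip(words, best_x) if xi]


# Hypothetical input: scores and (child, head) index pairs are made up.
words = ["the", "very", "old", "parser", "fails"]
scores = [1.0, 0.2, 0.8, 3.0, 3.0]
dependencies = [(0, 3), (1, 2), (2, 3), (3, 4)]

result = compress(words, scores, max_len=3, dependencies=dependencies)
print(result)  # ['the', 'parser', 'fails']
```

Changing the objective or the constraint set (for example, penalising ungrammatical word sequences instead of rewarding informative words) adapts the same formulation to a different front-end application.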