Project

Opinionated and Polarity IR

English title Opinionated and Polarity IR
Applicant Savoy Jacques
Number 124389
Funding scheme Project funding (Div. I-III)
Research institution Institut d'informatique Université de Neuchâtel
Institution of higher education University of Neuchatel - NE
Main discipline Information Technology
Start/End 01.07.2009 - 30.06.2012
Approved amount 159'867.00

All Disciplines (2)

Discipline
Information Technology
Other languages and literature

Keywords (7)

Information retrieval (IR); blog IR; opinionated IR; discourse analysis; web search; Opinion detection; Search on the Blogs

Lay Summary (English)

This research focuses on three main objectives. First, we want to design, implement and evaluate information retrieval (IR) systems able to work within the blogosphere (e-mail or related domains), where documents (parts of web pages) would be very short compared to the more traditional test collections. Second, we want to undertake a more elaborate investigation of opinion-finding IR systems. In this case we will target users who may want to know certain objective information concerning a target entity (or product, service, person). The retrieval system would have to clearly identify the retrieved answers as being factual. On the other hand, the expected answer could be an opinion (personal experience with respect to a product/service or personal opinion(s) on a target entity). In this case we want the IR system to classify the retrieved answers according to their polarity (positive, negative or mixed, showing a variety of opinions). Third, we want to design and implement an opinionated IR system on a macro level. We want to provide answers to questions such as "What are the principal opinions expressed in this document?", "Is this web site in favor of or against a given topic?", "Does this document (web site) express homogeneous opinions on a given target?" For all these questions we need to summarize a set of opinions or reactions to a given target topic.
Last update: 21.02.2013

Publications

Publication
Feature Selection in Sentiment Analysis
Zubaryeva Olena, Savoy Jacques (2012), Feature Selection in Sentiment Analysis, in Actes 9ième COnférence en Recherche d’Information et Applications CORIA’2012, Bordeaux.
Classification Based on Specific Vocabulary
Savoy Jacques, Zubaryeva Olena (2011), Classification Based on Specific Vocabulary, in Proceedings IEEE – WIC – ACM Web Intelligence, Lyon.

Scientific events

Active participation

Title CORIA 2012
Date 21.03.2012
Place Bordeaux, March 2012


Associated projects

Number Title Start Funding scheme
103420 Recherche documentaire en langues arabe et asiatique 01.05.2004 Project funding (Div. I-III)
113273 Multilingual and Contextual Information Retrieval 01.01.2007 Project funding (Div. I-III)

Abstract

This research proposal focuses on three main objectives.

First, we want to design, implement and evaluate information retrieval (IR) systems (Baeza-Yates & Ribeiro-Neto, 1999; Manning et al., 2008) able to work within the blogosphere (e-mail or related domains), where documents (parts of web pages) would be very short compared to the more traditional test collections used in the IR domain (e.g., newspapers, legal material). Moreover, the suggested IR system must work with at least three different natural languages.

Second, we want to undertake a more elaborate investigation of opinion-finding IR systems. Here we will target users who may want to know certain objective information concerning a target entity (a product, service, person, event, etc., such as “iPhone,” “Nestlé,” or “bombing in London”). In this case the retrieval system would have to clearly identify the retrieved answers as being factual (the item contains only “objective” data, for example numbers or facts relative to a product description). On the other hand, the expected answer could be an opinion (personal experience with respect to a product/service or personal opinion(s) on a target entity). In this case we want the IR system to classify the retrieved answers according to their polarity (positive, negative or mixed, showing a variety of opinions). As an additional feature, we also want to identify the person(s) expressing the underlying opinion as well as the specific target (strings) expressing the opinion within a sentence. In order to obtain the desired performance levels in this second part, we need to combine a traditional IR system with natural language processing (NLP) tools (e.g., morphological analysis, POS taggers) (Nugues, 2006) as well as various electronic resources (e.g., encyclopedias, authority lists, thesauri).

Third, we want to design and implement an opinionated IR system on a macro level. Whereas our second objective focuses on the micro level (mainly sentences), here we want to design and implement tools capable of analyzing document (or web site) content at a higher level (e.g., a chapter or an entire web site). In this part, we want to provide answers to questions such as “What are the principal opinions expressed in this document?”, “Is this web site in favor of or against a given topic?”, “Does this document (web site) express homogeneous opinions on a given target?” For all these questions we need to summarize a set of opinions or reactions to a given target topic. To answer these three questions, we would like to design a fully automatic system capable of working in a language-independent manner (or at least having a clear interface with the corresponding language). We will thus develop our IR system for at least three different languages.
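
To make the distinction between the micro level (sentence polarity) and the macro level (document-wide summary) concrete, the following is a minimal sketch in Python. It is not the project's method: the word lists, the function names sentence_polarity and document_summary, and the rule that a sentence without any lexicon word counts as "factual" are all assumptions chosen for illustration. A real system would rely on the NLP tools and electronic resources mentioned above.

```python
from collections import Counter

# Hypothetical toy lexicons (assumption); not taken from the project.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}


def sentence_polarity(sentence):
    """Micro level: label one sentence as positive, negative, mixed or factual."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in sentence.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    # Crude assumption: no opinion word found means the sentence is factual.
    return "factual"


def document_summary(sentences):
    """Macro level: aggregate sentence labels over a whole document."""
    counts = Counter(sentence_polarity(s) for s in sentences)
    opinionated = {k: v for k, v in counts.items() if k != "factual"}
    if not opinionated:
        return {"counts": dict(counts), "dominant": "factual", "homogeneous": True}
    dominant = max(opinionated, key=opinionated.get)
    # Homogeneous here means: every opinionated sentence shares one clear polarity.
    homogeneous = len(opinionated) == 1 and dominant != "mixed"
    return {"counts": dict(counts), "dominant": dominant, "homogeneous": homogeneous}


doc = [
    "The phone has a 3.5 inch screen.",
    "I love the camera.",
    "The battery is great!",
    "The case is terrible.",
]
summary = document_summary(doc)
print(summary["dominant"], summary["homogeneous"])
```

In this toy document two sentences are positive and one is negative, so the dominant polarity is positive but the opinions are not homogeneous. The aggregation step is deliberately separate from the sentence classifier, so a stronger, language-specific classifier could be swapped in without changing the macro-level logic.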