
Learning under near-ignorance: models and methods

English title: Learning under near-ignorance: models and methods
Applicant: Zaffalon Marco
Number: 137680
Funding scheme: Project funding (Div. I-III)
Research institution: Istituto Dalle Molle di studi sull’Intelligenza Artificiale (IDSIA), USI-SUPSI
Institution of higher education: University of Applied Sciences and Arts of Southern Switzerland - SUPSI
Main discipline: Information Technology
Start/End: 01.01.2012 - 31.12.2014
Approved amount: 220'000.00

Keywords (5)

Imprecise probability; Exponential family; Near-ignorance; Learning; Lower and upper expectations

Lay Summary (English)


Can a rational agent learn about a domain from data starting in a condition of ignorance?

Addressing this question is important because it is often desirable to start analyzing a problem from very weak initial assumptions, so as to implement an objective-minded approach to learning and inference.

This question has been most thoroughly investigated so far within well-established Bayesian learning theory. There, ignorance is modeled by a so-called noninformative prior distribution (e.g., a flat one), while data are summarized by a likelihood model. Posterior inferences (e.g., predictions or hypothesis tests) are obtained from the prior and the likelihood by Bayes' rule. However, the use of a single - albeit noninformative - prior as a model of ignorance has been criticized by several authors.
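
As a concrete instance of this standard Bayesian setting, the following minimal sketch works through Beta-Bernoulli updating under a flat prior; the Bernoulli likelihood and the specific counts are illustrative assumptions, not part of the project itself.

# Precise Bayesian updating under a flat ("noninformative") prior.
# Illustrative assumption: Bernoulli data with a Beta(1, 1) uniform prior.
# By conjugacy, after k successes in n trials the posterior is
# Beta(1 + k, 1 + n - k), whose mean is (1 + k) / (2 + n).

def posterior_mean_flat_prior(k: int, n: int) -> float:
    """Posterior mean of the success probability under a Beta(1, 1) prior."""
    return (1 + k) / (2 + n)

# A single prior yields a single, determinate posterior summary.
print(posterior_mean_flat_prior(k=7, n=10))  # 0.666...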

A more compelling alternative, in our opinion, relies on using a set of priors that generates vacuous lower and upper expectations: that is, an expectation interval whose extreme points coincide with the infimum and the supremum of the values a random variable can take. This expresses one's ignorance about a random variable very faithfully. Such models were first suggested by Walley under the name of "near-ignorance priors". They have been successfully applied to classification problems: by naturally leading to indeterminate (i.e., set-valued) predictions when the information in the data is not enough to draw stronger conclusions, they yield credible and reliable models. However, we have shown that in some cases near-ignorance makes learning from data impossible: the set of posteriors may coincide with the set of priors, which renders the data useless and is clearly a critical issue.
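
To make the set-of-priors alternative concrete, below is a minimal sketch of Walley's imprecise beta model for Bernoulli data, the near-ignorance prior for this case: before any observation the expectation interval for the success probability is vacuous, while after observing data it narrows, so near-ignorance and learning coexist here. The prior strength s = 2 and the counts used are illustrative assumptions.

# Walley's imprecise beta model: the set of all Beta(s*t, s*(1-t)) priors,
# t in (0, 1), with fixed prior strength s. Each prior gives posterior
# mean (k + s*t) / (n + s); letting t -> 0 and t -> 1 yields the bounds.

def posterior_expectation_bounds(k: int, n: int, s: float = 2.0):
    """Lower and upper posterior expectations of the success probability."""
    return k / (n + s), (k + s) / (n + s)

# With no data the interval is vacuous, faithfully expressing ignorance:
print(posterior_expectation_bounds(k=0, n=0))   # (0.0, 1.0)
# After 7 successes in 10 trials it has narrowed, i.e. learning occurred:
print(posterior_expectation_bounds(k=7, n=10))  # (0.58..., 0.75)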

To tackle this problem, we have established some general principles that should govern a well-defined learning process based on near-ignorance, and have specialized them to the case of the one-parameter exponential family. We have then proven that there is a unique set of priors that complies with these principles, and hence guarantees that both near-ignorance and learning are attained. We have also shown that the resulting set of priors reduces to other already known models of prior near-ignorance, thus indirectly proving the uniqueness of those models. Moreover, we have derived new models of near-ignorance that were not available before. We regard these results as a basic stepping stone towards properly answering the initial question. This seminal work should now be deepened and extended to fully tackle the question of learning under near-ignorance, and to bring it closer to applications. We plan to do so through the following tasks:

- Extension to the multivariate exponential family: this extension is fundamental, as the most important applications are based on it; this is for instance the case of regression analysis.

- Linear regression: to employ the multivariate model to develop new robust linear regression algorithms. Linear regression is an important area of data analysis, where near-ignorance can provide a leading approach to reliable inference.

- Metrics: to develop new performance metrics to compare a set-valued estimate/prediction with a single-valued one. While standard estimators yield determinate (i.e., single-valued) outcomes, those based on near-ignorance lead to set-valued ones; how to fairly compare the two through a metric is thus fundamental for judging the quality of the models in practice (see the sketch after this list).

- Extension to the non-parametric case: this extension is important as non-parametric models are becoming widespread in artificial intelligence thanks to increasing computational power. Modeling near-ignorance non-parametrically would make it possible to build very general models for statistical inference.

- Understanding the ultimate relationship between near-ignorance and learning: we plan to go deeper into these questions, exploiting some key results on the consistency and convergence of posterior distributions derived in the context of Bayesian inference.
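
As an illustration of the comparison problem raised in the Metrics task above, one simple baseline from the credal classification literature is discounted accuracy, which rewards a set-valued prediction containing the true value in inverse proportion to the set's size; the sketch below is just this baseline, not one of the new metrics the project aims to develop.

def discounted_accuracy(predicted_set: set, true_value) -> float:
    """Score a set-valued prediction: 1/|set| if it contains the true
    value, 0 otherwise. A single-valued prediction is a set of size one."""
    return 1.0 / len(predicted_set) if true_value in predicted_set else 0.0

# A determinate correct prediction scores 1; an indeterminate prediction
# that still contains the truth scores less, reflecting its vagueness.
print(discounted_accuracy({"a"}, "a"))       # 1.0
print(discounted_accuracy({"a", "b"}, "a"))  # 0.5
print(discounted_accuracy({"b"}, "a"))       # 0.0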

Last update: 21.02.2013


Associated projects

Number | Title | Start | Funding scheme
162188 | Statistical learning and inference on big data with probabilistic graphical models | 01.01.2016 | South Korea
67961 | Investigations on the theory and practice of credal classification | 01.04.2003 | Project funding (Div. I-III)
121785 | Ignorance, caution and the naive credal classifier | 01.01.2009 | Project funding (Div. I-III)
146606 | Robust structure learning of Bayesian networks | 01.01.2014 | Project funding (Div. I-III)
132252 | Multi Model Inference for dealing with uncertainty in environmental models | 01.10.2010 | Project funding (Div. I-III)
134759 | Credal networks made easy | 01.04.2011 | Project funding (Div. I-III)
118071 | Credal Model Averaging for dealing with uncertainty in environmental models | 01.10.2008 | Project funding (Div. I-III)
113820 | Statistical learning from imperfect observations under prior ignorance | 01.10.2006 | Project funding (Div. I-III)

Abstract

Can a rational agent learn about a domain from data starting in a condition of ignorance? Addressing this question is important because it is often desirable to start analyzing a problem from very weak initial assumptions, so as to implement an objective-minded approach to learning and inference. This question has been most thoroughly investigated so far within well-established Bayesian learning theory. There, ignorance is modeled by a so-called noninformative prior distribution (e.g., a flat one), while data are summarized by a likelihood model; posterior inferences (e.g., predictions or hypothesis tests) are obtained from the prior and the likelihood by Bayes' rule. The use of a single - albeit noninformative - prior as a model of ignorance has, however, been criticized by several authors.

A more compelling alternative, in our opinion, relies on using a set M of priors that generates lower and upper expectations \underline{E} and \overline{E} such that \underline{E}(g) = \inf_x g(x) and \overline{E}(g) = \sup_x g(x) for suitable functions of interest g. This expresses very faithfully that one's expectations about g are totally vacuous, which is a clear state of ignorance. Such models were first suggested in [Walley, 1991] under the name of "near-ignorance priors". They have been successfully applied to classification problems: by naturally leading to indeterminate (i.e., set-valued) predictions when the information in the data is not enough to draw stronger conclusions, they yield credible and reliable models. However, in some cases near-ignorance also raises the fundamental problem [Piatti et al., 2007] of making learning from data impossible: the set of posteriors may coincide with the set of priors, which renders the data useless and is clearly a serious issue.

To tackle this problem, we have established some general principles that should govern a well-defined learning process based on near-ignorance, and have specialized them to the case of the one-parameter exponential family. We have then proven that there is a unique set of priors that complies with these principles, and hence guarantees that both near-ignorance and learning are attained. We have also shown that the resulting set of priors reduces to other already known models of prior near-ignorance, thus indirectly proving the uniqueness of those models. Moreover, we have derived new models of near-ignorance that were not available before. We regard these results as a basic stepping stone towards properly answering the initial question. This seminal work should now be deepened and extended to fully tackle the question of learning under near-ignorance, and to bring it closer to applications. We plan to do so through the following tasks.

- Extension to the multivariate exponential family: this extension is fundamental, as the most important applications are based on it; this is for instance the case of regression analysis.

- Linear regression: to employ the multivariate model to develop new robust linear regression algorithms. Linear regression is an important area of data analysis, where near-ignorance can provide a leading approach to reliable inference.

- Metrics: to develop new performance metrics to compare a set-valued estimate/prediction with a single-valued one. While standard estimators yield determinate (i.e., single-valued) outcomes, those based on near-ignorance lead to set-valued ones; how to fairly compare the two through a metric is thus fundamental for judging the quality of the models in practice.

- Extension to the non-parametric case: this extension is important as non-parametric models are becoming widespread in artificial intelligence thanks to increasing computational power. Modeling near-ignorance non-parametrically would make it possible to build very general models for statistical inference.

- Understanding the ultimate relationship between near-ignorance and learning: we plan to go deeper into these questions, exploiting important results on the consistency and convergence of posterior distributions derived in the context of Bayesian inference.