Project


Robust structure learning of Bayesian networks

English title Robust structure learning of Bayesian networks
Applicant Zaffalon Marco
Number 146606
Funding scheme Project funding (Div. I-III)
Research institution Istituto Dalle Molle di studi sull’Intelligenza Artificiale (IDSIA) IDSIA USI-SUPSI
Institution of higher education University of Applied Sciences and Arts of Southern Switzerland - SUPSI
Main discipline Mathematics
Start/End 01.01.2014 - 31.12.2017
Approved amount 236'446.00

All Disciplines (2)

Discipline
Mathematics
Other disciplines of Engineering Sciences

Keywords (7)

Credal and multi-label classification; Structure learning; Bayesian networks; Multiple models; Averaging; Robustness; Imprecise probability

Lay Summary (Italian)

Lead
The project concerns the robust learning of the structure of Bayesian networks from data. Bayesian networks are mathematical models for reasoning under uncertainty. An important part of the model is a graphical structure that represents the dependence relations among the variables. The goal is to develop algorithms for learning this structure from data in such a way that it remains reliable even when the data are poorly informative.
Lay summary

Bayesian networks are models that allow a computer to carry out rational reasoning under uncertainty. They are therefore used in many real-world decision-support problems: for example, medical diagnosis, image recognition, forecasting of adverse natural events, and so on.

A Bayesian network consists of two parts: a graphical part that uses arrows to represent the cause-effect relations (and, more generally, the dependence relations) among the variables of the domain at hand; and a probabilistic part that numerically quantifies the strength of those relations. Algorithms exist that can learn both parts automatically from historical data, provided the data are representative of the domain under study. This greatly simplifies the construction of a model based on Bayesian networks.
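As a toy illustration of these two parts, here is a minimal Python sketch (the variables and probability values are entirely hypothetical): a two-node network whose graphical part is the single arrow Rain -> WetGrass and whose probabilistic part is two conditional probability tables; Bayes' rule then performs the reasoning under uncertainty.

```python
# Hypothetical example: Bayesian network Rain -> WetGrass, factorizing
# the joint distribution as P(Rain, WetGrass) = P(Rain) * P(WetGrass | Rain).

# Probabilistic part: illustrative conditional probability tables (CPTs).
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {
    True: {True: 0.9, False: 0.1},   # P(WetGrass | Rain=True)
    False: {True: 0.2, False: 0.8},  # P(WetGrass | Rain=False)
}

def joint(rain, wet):
    """Joint probability from the chain-rule factorization over the graph."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Reasoning under uncertainty: P(Rain=True | WetGrass=True) via Bayes' rule.
evidence = joint(True, True) + joint(False, True)
posterior = joint(True, True) / evidence
print(round(posterior, 3))  # -> 0.529
```

Observing wet grass raises the probability of rain from the prior 0.2 to roughly 0.53, which is exactly the kind of automatic rational updating the summary describes.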

However, data are not always sufficient, or sufficiently representative of the domain. In these cases the above algorithms can produce, in particular, a graphical structure that does not correspond to the one sought. These algorithms are said to be fragile with respect to the scarce information contained in the data. To overcome this problem, it is important to develop a new class of algorithms able to produce so-called robust solutions even in the face of poorly informative data. This is the main goal of the project. The robust algorithms we propose to develop must be able to adapt to the amount of information actually present in the data, for example by allowing multiple solutions: that is, by returning more than one Bayesian network in output, each consistent with the available data; this set of networks will converge towards a single network as the data grow and become more informative. Alternatively, they must be able to choose, among all the structures compatible with the data, the one that is most representative.
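The idea of returning several networks consistent with the data can be sketched concretely. In this illustrative Python fragment (the data set, the tolerance and the scoring choices are all hypothetical), the three possible structures over two binary variables are scored with BIC, and every structure whose score lies within a tolerance of the best is kept, instead of committing to a single winner:

```python
import math
from collections import Counter

# Hypothetical sketch of set-based structure selection over two binary
# variables A and B: score the three candidate structures (no edge,
# A -> B, B -> A) with BIC and keep all structures close to the best.

data = [(0, 0), (0, 0), (1, 1), (1, 0), (0, 0), (1, 1)]
n = len(data)

def bic(log_lik, n_params):
    """BIC score: log-likelihood minus a complexity penalty."""
    return log_lik - 0.5 * n_params * math.log(n)

def ll_marginal(values):
    """Log-likelihood of one variable under its empirical distribution."""
    counts = Counter(values)
    return sum(c * math.log(c / n) for c in counts.values())

def ll_conditional(child, parent):
    """Log-likelihood of child given parent, with ML parameters."""
    joint, marg = Counter(zip(parent, child)), Counter(parent)
    return sum(c * math.log(c / marg[p]) for (p, _), c in joint.items())

a = [row[0] for row in data]
b = [row[1] for row in data]

scores = {
    "A  B":   bic(ll_marginal(a) + ll_marginal(b), 2),       # no edge
    "A -> B": bic(ll_marginal(a) + ll_conditional(b, a), 3),
    "B -> A": bic(ll_marginal(b) + ll_conditional(a, b), 3),
}

best = max(scores.values())
tolerance = 1.0  # in log units; hypothetical choice
robust_set = sorted(s for s, v in scores.items() if best - v <= tolerance)
print(robust_set)  # -> ['A -> B', 'B -> A']
```

On this tiny sample the two single-arrow structures tie (they encode the same set of distributions), while the empty graph falls just outside the tolerance; with more informative data the returned set would shrink towards a single structure, as the summary describes.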

Last update: 23.12.2013

Responsible applicant and co-applicants

Employees


Publications

Publication
Approximate structure learning for large Bayesian networks
Scanagatta Mauro, Corani Giorgio, de Campos Cassio Polpo, Zaffalon Marco (2018), Approximate structure learning for large Bayesian networks, in Machine Learning, 107(8-10), 1209-1227.
Entropy-based pruning for learning Bayesian networks using BIC
de Campos Cassio P., Scanagatta Mauro, Corani Giorgio, Zaffalon Marco (2018), Entropy-based pruning for learning Bayesian networks using BIC, in Artificial Intelligence, 260, 42-50.
Efficient learning of bounded-treewidth Bayesian networks from complete and incomplete data sets
Scanagatta Mauro, Corani Giorgio, Zaffalon Marco, Yoo Jaemin, Kang U. (2018), Efficient learning of bounded-treewidth Bayesian networks from complete and incomplete data sets, in International Journal of Approximate Reasoning, 95, 152-166.
Axiomatising incomplete preferences through sets of desirable gambles
Zaffalon M., Miranda E. (2017), Axiomatising incomplete preferences through sets of desirable gambles, in Journal of Artificial Intelligence Research, 60, 1057-1126.
Improved local search in Bayesian networks structure learning
Scanagatta M., Corani G., Zaffalon M. (2017), Improved local search in Bayesian networks structure learning, in Proceedings of The 3rd International Workshop on Advanced Methodologies for Bayesian Networks, Proceedings of Machine Learning Research 74, PMLR, 45-56.
Statistical comparison of classifiers through Bayesian hierarchical modelling
Corani G., Benavoli A., Demšar J., Mangili F., Zaffalon M. (2017), Statistical comparison of classifiers through Bayesian hierarchical modelling, in Machine Learning, 106(11), 1817-1837.
Air pollution prediction via multi-label classification
Corani G., Scanagatta M. (2016), Air pollution prediction via multi-label classification, in Environmental Modelling & Software, 80, 259-264.
Bayesian network data imputation with application to survival tree analysis
Rancoita P., Zaffalon M., Zucca E., Bertoni F., De Campos C. (2016), Bayesian network data imputation with application to survival tree analysis, in Computational Statistics and Data Analysis, 93, 373-387.
Conformity and independence with coherent lower previsions
Miranda E., Zaffalon M. (2016), Conformity and independence with coherent lower previsions, in International Journal of Approximate Reasoning, 78, 125-137.
Learning extended tree augmented naive structures
De Campos C., Corani G., Scanagatta M., Cuccu M., Zaffalon M. (2016), Learning extended tree augmented naive structures, in International Journal of Approximate Reasoning, 68, 153-163.
Learning treewidth-bounded Bayesian networks with thousands of variables
Scanagatta M., Corani G., De Campos C., Zaffalon M. (2016), Learning treewidth-bounded Bayesian networks with thousands of variables, in NIPS 2016: Proceedings of the 29th Annual Conference on Neural Information Processing Systems, 1426–1470.
A Bayesian nonparametric procedure for comparing algorithms
Benavoli A., Corani G., Mangili F., Zaffalon M. (2015), A Bayesian nonparametric procedure for comparing algorithms, in ICML 2015: Proceedings of the 32nd International Conference on Machine Learning, PMLR 37, 1264–1272.
Approximate credal network updating by linear programming with applications to decision making
Antonucci A., De Campos C., Huber D., Zaffalon M. (2015), Approximate credal network updating by linear programming with applications to decision making, in International Journal of Approximate Reasoning, 58, 25-38.
Conformity and independence with coherent lower previsions
Miranda E., Zaffalon M. (2015), Conformity and independence with coherent lower previsions, in ISIPTA '15: Proceedings of the Ninth International Symposium on Imprecise Probability: Theories and Applications, SIPTA, 197–206.
Imprecise Dirichlet process with application to the hypothesis test on the probability that X≤Y
Benavoli A., Mangili F., Ruggeri F., Zaffalon M. (2015), Imprecise Dirichlet process with application to the hypothesis test on the probability that X≤Y, in Journal of Statistical Theory and Practice, 9(3), 658-684.
Independent products in infinite spaces
Miranda E., Zaffalon M. (2015), Independent products in infinite spaces, in Journal of Mathematical Analysis and Application, 425, 460-488.
Learning Bayesian networks with thousands of variables
Scanagatta M., De Campos C., Corani G., Zaffalon M. (2015), Learning Bayesian networks with thousands of variables, in NIPS 2015: Proceedings of the 28th Annual Conference on Neural Information Processing Systems., Curran Associates, 1855–1863.
On the problem of computing the conglomerable natural extension
Miranda E., Zaffalon M. (2015), On the problem of computing the conglomerable natural extension, in International Journal of Approximate Reasoning, 56(A), 1-27.
Prior near ignorance for inferences in the k-parameter exponential family
Benavoli A., Zaffalon M. (2015), Prior near ignorance for inferences in the k-parameter exponential family, in Statistics, 49(5), 1104-1140.
Comments on “Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks” by Andrés R. Masegosa and Serafín Moral
Zaffalon M., Corani G. (2014), Comments on “Imprecise probability models for learning multinomial distributions from data. Applications to learning credal networks” by Andrés R. Masegosa and Serafín Moral, in International Journal of Approximate Reasoning, 55(7), 1579-1600.
A Bayesian Wilcoxon signed-rank test based on the Dirichlet process
Benavoli A., Mangili F., Corani G., Zaffalon M., Ruggeri F. (2014), A Bayesian Wilcoxon signed-rank test based on the Dirichlet process, in ICML 2014: Proceedings of the 31st International Conference on Machine Learning, PMLR 32, 1026–1034.
Extended tree augmented naive classifier
De Campos C., Cuccu M., Corani G., Zaffalon M. (2014), Extended tree augmented naive classifier, in PGM'14: Proceedings of the Seventh European Workshop on Probabilistic Graphical Models, Springer, Lecture Notes in Computer Science 8754, 176–189.
Min-BDeu and max-BDeu scores for learning Bayesian networks
Scanagatta M., De Campos C., Zaffalon M. (2014), Min-BDeu and max-BDeu scores for learning Bayesian networks, in PGM'14: Proceedings of the Seventh European Workshop on Probabilistic Graphical Models, Springer, Lecture Notes in Computer Science 8754, 426–441.

Associated projects

Number Title Start Funding scheme
121785 Ignorance, caution and the naive credal classifier 01.01.2009 Project funding (Div. I-III)
134759 Credal networks made easy 01.04.2011 Project funding (Div. I-III)
132252 Multi Model Inference for dealing with uncertainty in environmental models 01.10.2010 Project funding (Div. I-III)
162188 Statistical learning and inference on big data with probabilistic graphical models 01.01.2016 South Korea
109295 Uncertain reasoning under incompleteness 01.10.2005 Project funding (Div. I-III)
118071 Credal Model Averaging for dealing with uncertainty in environmental models 01.10.2008 Project funding (Div. I-III)
116674 Unifying graphical models by credal networks: algorithms and applications 01.03.2008 Project funding (Div. I-III)
137680 Learning under near-ignorance: models and methods 01.01.2012 Project funding (Div. I-III)

Abstract

Bayesian networks combine graphs and probabilities in order to perform reasoning under uncertainty. Their main advocate, Judea Pearl, was awarded the prestigious ACM Turing Award in 2011 because, among other things, "this work not only revolutionized the field of artificial intelligence but also became an important tool for many other branches of engineering and the natural sciences". The graph of a Bayesian network describes the dependence relations between the variables in a domain, while its (probabilistic) parameters encode their strengths. A Bayesian network can be learned from data, queried to assess the probability of events given observations, and used to support decision making.

For Bayesian networks to be successful, a crucial and very challenging task that must often be addressed is how to learn the dependency graph (the "structure") from data. This is typically achieved by searching the space of graphs while trying to optimize a score that measures how well a graph fits the data (or other domain knowledge).

Structure learning has been, and still is, the subject of intense, cutting-edge research, and very sophisticated algorithms now exist for this task. However, these methods fall short of considering the reliability of their results. Two undesired outcomes are then possible: (i) there is a structure with a slightly better fit than the others, but one that is heavily prone to structural changes under small variations in the data (or personal beliefs); (ii) there are many structures with similar scores that lead to different outcomes in applications, so that the choice among them is highly arbitrary. Problem (i) is severe in applications, such as genomics, where the focus is often on understanding the underlying mechanism that the network is supposed to model. Problem (ii) negatively affects the application of Bayesian networks to prediction, as in the data-mining task of "pattern classification", because structures with similar scores can, in some cases, lead to quite different prediction accuracy.

In this project we aim at developing reliable (or "robust") structure learning methods. Our standpoint is that, to achieve reliability, we should have methods able to infer from data, and work with, multiple candidate structures. In this we are inspired by the field of "imprecise" (or "credal") probability, where it is widely acknowledged that robustness is entangled with sets of models (and in particular sets of probability distributions). We will proceed according to the following plan:

- We will start by studying current methods for structure learning and the reliability of their results, through *sensitivity analysis of optimal structures*. The idea is to understand which of their characteristics can be borrowed.
- We will *develop algorithms that deliver sets of candidate structures*. To this end, we will first deliver new, state-of-the-art algorithms for the traditional case of (standard) structure learning. Then we will build upon them for our set-based methods, using k-best ideas (i.e., finding the k most probable structures) when the focus is on precise probability, as well as imprecise-probability criteria such as maximality (which will yield the more structures the less informative the data are).
- We will *develop robust score functions and models based on imprecise probability*. We will employ decision criteria such as "maximum entropy" and "maximin" in order to produce a single, robust structure out of the multiple ones when this is required. We will also keep working with multiple structures and, whenever needed, probabilistically average over them (by "credal model averaging" in the case of maximality, and Bayesian model averaging in the k-best case).
- We will *extend the above methods to the case of incomplete data*. Few approaches handle missing data to date.
- We will develop these methods with (some) *accuracy guarantees, and we will formally study their complexity*.
- We will *specialize robust structure learning to problems of multi-label classification* (where an object can belong to multiple categories). Current methods are still limited in this respect.
- We will *apply robust structure learning to real genomic and clinical data sets*. Working with these data sets is especially relevant, because dealing with incomplete and multi-label data there is often unavoidable. Also, by using more reliable procedures, we intend to improve on current methods, yielding results that have practical impact.
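The sensitivity analysis of optimal structures mentioned in the plan can be illustrated with a bootstrap, a standard way to probe stability under small variations in the data. In this hypothetical Python sketch (toy data, two binary variables, BIC scoring — none of it from the project itself), the data set is resampled many times and the winning structure is recorded each time; an unstable winner is exactly the fragility described under problem (i):

```python
import math
import random
from collections import Counter

# Hypothetical sensitivity analysis: bootstrap-resample a small data set,
# refit the BIC-optimal structure among the three DAGs over two binary
# variables, and count how often each structure wins across resamples.

random.seed(0)
data = [(0, 0), (0, 0), (1, 1), (1, 0), (0, 0), (1, 1), (0, 1), (1, 1)]

def bic_best(sample):
    """Return the BIC-optimal structure for one (re)sample."""
    n = len(sample)
    a = [row[0] for row in sample]
    b = [row[1] for row in sample]

    def ll_marg(values):
        counts = Counter(values)
        return sum(c * math.log(c / n) for c in counts.values())

    def ll_cond(child, parent):
        joint, marg = Counter(zip(parent, child)), Counter(parent)
        return sum(c * math.log(c / marg[p]) for (p, _), c in joint.items())

    scores = {  # log-likelihood minus 0.5 * #params * log(n)
        "A  B":   ll_marg(a) + ll_marg(b) - 1.0 * math.log(n),
        "A -> B": ll_marg(a) + ll_cond(b, a) - 1.5 * math.log(n),
        "B -> A": ll_marg(b) + ll_cond(a, b) - 1.5 * math.log(n),
    }
    return max(scores, key=scores.get)

wins = Counter(
    bic_best([random.choice(data) for _ in data]) for _ in range(200)
)
print(wins)
```

If one structure wins the large majority of resamples, the learned graph is stable; a split tally is the warning sign that motivates returning a set of candidate structures instead of a single one.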