Gaussian processes; Kalman filtering and smoothing; Nonlinear Bayesian nonparametrics; Regression and classification; State-space representation
Benavoli Alessio, Facchini Alessandro, Piga Dario, Zaffalon Marco (2019), Sum-of-squares for bounded rationality, in International Journal of Approximate Reasoning, 105, 130-152.
Azzimonti Dario, Ginsbourger David, Rohmer Jérémy, Idier Déborah (2019), Profile extrema for visualizing and quantifying uncertainties on excursion regions. Application to coastal flooding, in Technometrics
Piga Dario (2019), Semialgebraic Outer Approximations for Set-Valued Nonlinear Filtering, in Proc. of the European Control Conference (ECC), Napoli, IEEE.
Azzimonti Dario, Rottondi Cristina, Tornatore Massimo (2019), Using Active Learning to Decrease Probes for QoT Estimation in Optical Networks, in Optical Fiber Communication Conference, San Diego, California, OSA Publishing, Optical Society of America.
Benavoli Alessio (2018), Dual Probabilistic Programming, in Proc. of PROBPROG 2018, The International Conference on Probabilistic Programming, Boston, US.
Azzimonti D., Ginsbourger D., Chevalier C., Bect J. and Richet (2018), Adaptive design of experiments for conservative estimation of excursion sets, Isaac Newton Institute for Mathematical Sciences, Cambridge, UK.
Big data analytics is the process of examining large amounts of data to uncover hidden patterns, unknown relationships and other useful information that can be used to make better decisions. It will be the upcoming key tool for the "internet of things", business intelligence and quantitative finance, to mention just a few. Big data severely constrains the type of algorithms we can use, because, for instance, even models that take time quadratic in the size of the data will most probably not run fast enough to process the data in the required time. Much research has thus opted for very simple algorithms, for subsampling of the data, or for algorithms based on (deep) neural nets, which are universal approximators that remain relatively viable with big data, even though they give no reliability guarantees and are not easy to design and train.

Now imagine an alternative to all these approaches that is principled, sophisticated, naturally comes with measures of reliability, is much simpler to train than neural nets, and automatically adapts the complexity of the model to the size of the data. This alternative exists: Gaussian Processes (GPs), which are part of non-linear Bayesian non-parametrics. Being non-linear means that they can capture very general trends in multidimensional data; being non-parametric means that they make very weak assumptions, leading to more reliable models; and being Bayesian means that there is solid theory behind them, on which we can do the maths to derive algorithms and prove their properties.
GPs are an emerging tool in machine learning, and yet large data problems are mostly uncharted territory for them: in fact, they can only be applied to at most a few thousand training points n, due to the O(n^3) time and O(n^2) space required for learning.

The ambitious goal of this project is very simple to state: we want to derive a principled, accurate approximation of GPs that can exploit all the available data while requiring only O(n) time and space. Stated differently, we aim to achieve great modelling power at the minimum possible time and space complexity for a method that takes all the data into account. This will allow machine learning to address an entirely new range of applications, namely, all those whose data are currently too big to be analysed. It has the potential to create groundbreaking new applications and tools, and to overcome the limitations of deep neural networks. To show this potential, we will apply it to MeteoSwiss' rainfall intensity forecast as well as Armasuisse's electrosmog spatio-temporal estimation.
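To make the bottleneck concrete, the following is a minimal sketch of exact GP regression in NumPy (not the project's own method). The kernel choice, hyperparameters and noise level are illustrative assumptions; the point is that the Cholesky factorisation of the n x n kernel matrix is exactly the O(n^3)-time, O(n^2)-space step the text refers to:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel matrix between 1-D input sets."""
    sq_dist = (X1[:, None] - X2[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, noise=1e-2):
    """Exact GP regression posterior mean.

    Building K costs O(n^2) memory; its Cholesky factorisation costs
    O(n^3) time -- the scaling bottleneck discussed in the text.
    """
    n = len(X_train)
    K = rbf_kernel(X_train, X_train) + noise * np.eye(n)
    L = np.linalg.cholesky(K)                 # O(n^3) time
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_star = rbf_kernel(X_test, X_train)
    return K_star @ alpha

# Noisy samples of sin(x): the posterior mean recovers the smooth trend.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(X) + 0.05 * rng.standard_normal(50)
X_new = np.array([np.pi / 2, 3 * np.pi / 2])
mean = gp_posterior_mean(X, y, X_new)
```

With 50 points this runs instantly, but because the kernel matrix grows as n^2 and its factorisation as n^3, the same code becomes infeasible at the data sizes the project targets, which is what motivates the O(n) approximation.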