Robust Deep Density Models for High-Energy Particle Physics and Solar Flare Analysis (RODEM)

Applicant Fleuret François
Number 193716
Funding scheme Sinergia
Research institution Computer Science Department Université de Genève
Institution of higher education University of Geneva - GE
Main discipline Interdisciplinary
Start/End 01.12.2020 - 30.11.2024
Approved amount 2'654'095.00

All Disciplines (4)

Discipline
Interdisciplinary
Astronomy, Astrophysics and Space Sciences
Particle Physics
Information Technology

Keywords (7)

generative models; anomaly detection; large-scale machine learning; deep learning; high energy physics; solar astronomy; artificial intelligence

Lay Summary (translated from French)

Lead
The RODEM project aims to develop new artificial-intelligence techniques based on machine learning to complement traditional physical modeling. We focus on two applications, in the fields of high-energy particle physics on the one hand, and solar astronomy on the other.
Lay summary
In particle physics, our objective is to develop new anomaly detectors in order to isolate, within large sets of events generated with particle accelerators, those that diverge from standard physical models and should therefore be taken into account when building new physical theories.

In solar astronomy, we want to develop new solar-flare prediction techniques, capable of determining hours or days in advance whether such flares will occur, given images of the Sun's surface. Predicting such flares is important in particular for taking preventive measures to protect sensitive installations from the resulting electromagnetic pulses.

For both problems, machine learning, and in particular deep neural networks, offers tools that leverage existing measurements to produce models exploiting statistical structure that would be impossible to capture with traditional mathematical modeling.

The challenges specific to, and shared by, these two application fields are the complexity of the underlying models, the amount of data to process, the high dimensionality of the measurement signals, and the need for statistical confidence bounds.

Last update: 17.09.2020

Associated projects

Number Title Start Funding scheme
188758 Computational Reduction for Training and Inference (CORTI) 01.03.2020 Project funding (Div. I-III)
167158 Machine Learning based Analytics for Big Data in Astronomy 01.07.2017 NRP 75 Big Data
169112 Importance sampling for large-scale unsupervised learning (ISUL) 01.03.2017 Project funding (Div. I-III)

Abstract

The over-arching goal of the RODEM project is to develop new large-scale machine-learning methods with better behavior when dealing with rare events, and provable guarantees, to enhance the performance of data-driven prediction and anomaly detection in the fields of High Energy Physics (HEP) and solar astronomy. These two application domains share the large scale of data generated by existing facilities, the high dimension of the signals at hand, the extreme rarity of interesting and important events, and the complexity of the underlying phenomena. These features make a strongly data-driven approach promising for the next generation of models, and the importance of such methods is rapidly increasing in adjacent technical domains such as microscopic imaging and materials science.

The ultimate objectives of our project are to create:
(1) better forecasting tools that can be trained from very large amounts of high-dimensional data with limited supervision,
(2) better generative models that could play the role of computationally cheap surrogates for classical and computationally expensive simulators, and
(3) anomaly detectors able to characterize out-of-distribution samples, with formal guarantees, to operate in a super-rare anomaly regime.

We structure the activity in the project along these three groups of challenges. To tackle them, we will first develop new theoretical tools from information theory, to generalize and frame more consistently generative and discriminative models into a common formalism, and to derive finer bounds on the amount of information we get from training data and input signals.
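One way to picture objective (3) is density-based anomaly detection: fit a density model to ordinary ("background") events, then flag events whose likelihood falls below a threshold calibrated to a target false-alarm rate. The sketch below uses a plain Gaussian density as a stand-in for the deep density models the project targets; the data, names, and numbers are purely illustrative, not the project's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "background" events: in-distribution samples with 2 features.
background = rng.normal(loc=0.0, scale=1.0, size=(10_000, 2))

# Fit a simple Gaussian density model to the background.
mu = background.mean(axis=0)
cov = np.cov(background, rowvar=False)
cov_inv = np.linalg.inv(cov)
log_norm = -0.5 * (2 * np.log(2 * np.pi) + np.log(np.linalg.det(cov)))

def log_density(x):
    """Log-likelihood of events x under the fitted Gaussian."""
    d = x - mu
    return log_norm - 0.5 * np.einsum("ij,jk,ik->i", d, cov_inv, d)

# Threshold chosen so that a fraction alpha of background events is flagged,
# giving an (approximate, finite-sample) handle on the false-alarm rate.
alpha = 1e-3
threshold = np.quantile(log_density(background), alpha)

# An out-of-distribution event far from the background mode is flagged.
event = np.array([[6.0, 6.0]])
is_anomaly = log_density(event) < threshold
```

The calibration step is what the "formal guarantees" objective generalizes: controlling how often normal events are falsely flagged even when true anomalies are vanishingly rare.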
In parallel, we will investigate the design of computationally efficient methods relying heavily on sampling, to avoid exhaustive computation when the number of samples or the signal size grows far beyond the current limits of standard deterministic deep-learning models, given the computational capabilities of available processing hardware. Thanks to these new classes of methods, we aim to solve the above problems in two targeted application domains of remarkable importance.

The first targeted application is to develop new powerful concepts for HEP. The Standard Model of particle physics (SM) is currently our best description of the world of fundamental particles and their interactions. However, the SM leaves many questions unanswered, such as the nature of dark matter, which suggests the existence of physics Beyond the SM (BSM). The Large Hadron Collider (LHC) was conceived to discover BSM physics, a task still outstanding, in contrast to CERN's discovery of the Higgs boson in 2012. This project will develop anomaly detection to aid in this ultimate task. In addition, the project aims at a unified approach to the inference problem and at designing more efficient simulator surrogates, both of which are crucial to approaching the full exploitation of the LHC data.

The second targeted application is the design of a new generation of methods in the field of solar astronomy, able to cope better with the solar observational reality than the current state-of-the-art techniques. Such methods can be considered, on the one hand, as a way to extend current efforts in the direction of solar-flare prediction. However, these methods go further, directly influencing the amount of science return from the data. A particular challenge is the generation of synthetic events to address the class-imbalance problem endemic to flare size. Even more challenging is the continuous monitoring for, and search of, yet-unknown solar events.
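The class-imbalance issue mentioned above arises because large flares are far rarer than small ones. A deliberately simple way to see what "generating synthetic events" buys is SMOTE-style interpolation between real minority-class events; the deep generative models the project proposes would replace this toy augmentation. All data and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical flare dataset: many small-flare events, few large-flare events.
small = rng.normal(0.0, 1.0, size=(1000, 3))
large = rng.normal(3.0, 1.0, size=(20, 3))  # rare class

def oversample_interpolate(minority, n_new, rng):
    """SMOTE-style augmentation: each synthetic event is a random convex
    combination of a pair of real minority-class events."""
    i = rng.integers(0, len(minority), size=n_new)
    j = rng.integers(0, len(minority), size=n_new)
    t = rng.random((n_new, 1))
    return minority[i] + t * (minority[j] - minority[i])

synthetic = oversample_interpolate(large, n_new=980, rng=rng)
balanced_large = np.vstack([large, synthetic])
# Both classes now contribute 1000 training events each.
```

Interpolation keeps synthetic events inside the convex hull of the observed minority class; a learned generative surrogate can instead extrapolate the tail of the flare-size distribution, which is precisely where it is most valuable and hardest to validate.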