Importance sampling for large-scale unsupervised learning (ISUL)

Applicant Fleuret François
Number 169112
Funding scheme Project funding (Div. I-III)
Research institution IDIAP Institut de Recherche
Institution of higher education Idiap Research Institute - IDIAP
Main discipline Information Technology
Start/End 01.03.2017 - 29.02.2020
Approved amount 374'251.00

Keywords (4)

Deep learning; Computer vision; Machine learning; Pattern recognition

Lay Summary (translated from French)

Lead
In just a few years, large artificial neural networks have become the most effective method for extracting semantic information from high-dimensional signals. They are successfully applied to problems as diverse as object and person recognition, speech recognition, written-language analysis, and image generation, and they are at the heart of many technologies we use every day. These techniques prove extremely effective when the amount of training data and the available computing power are both very large. This need for resources restricts their use to large industrial groups, indirectly causes an ecological impact, and ultimately prevents even more ambitious problems from being tackled.
Lay summary
We propose to attack this problem along two different axes.

The first is to reduce the dependence of the computational cost on the data by developing techniques able to concentrate the computational effort on the subset of training examples that are actually problematic. In practice, one systematically observes that only an extremely small proportion of examples has a real impact on the training of a neural network, and that most of the computation is spent on data that turn out, in hindsight, to be of no interest.

Our second sub-project will address learning the structure of the neural network itself, to avoid having to define this structure manually, with the very high cost induced by training multiple variants.

Last update: 16.02.2017

Publications

Katharopoulos A., Vyas A., Pappas N., Fleuret F. (2020), Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, in Proceedings of the International Conference on Machine Learning (ICML), PMLR.
Srinivas S., Fleuret F. (2019), Full-Gradient Representation for Neural Network Visualization, in Advances in Neural Information Processing Systems 32, 4124-4133, Curran Associates, Inc., New York.
Katharopoulos A., Fleuret F. (2019), Processing Megapixel Images with Deep Attention-Sampling Models, in Proceedings of the 36th International Conference on Machine Learning, PMLR 97, 3282-3291.
Srinivas S., Fleuret F. (2018), Knowledge Transfer with Jacobian Matching, in Proceedings of the 35th International Conference on Machine Learning, PMLR 80, 4723-4731.
Katharopoulos A., Fleuret F. (2018), Not All Samples Are Created Equal: Deep Learning with Importance Sampling, in Proceedings of the 35th International Conference on Machine Learning, PMLR 80, 2525-2534.

Collaboration

CVLab at EPFL, Switzerland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
- Exchange of personnel

The Center for Machine Perception, Czech Technical University, Czech Republic (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Exchange of personnel

Paul G. Allen School of Computer Science & Engineering, University of Washington, United States of America (North America)
- in-depth/constructive exchanges on approaches, methods or results
- Publication

Associated projects

- 188758: Computational Reduction for Training and Inference (CORTI); start 01.03.2020; Project funding (Div. I-III)
- 147693: Tracking in the Wild; start 01.01.2014; Sinergia

Abstract

This project aims at investigating two key issues in the training of large neural networks on large-scale training sets. All the developed techniques will be benchmarked on standard image-classification and object-detection datasets, on pedestrian detection and re-identification, and on controlled datasets produced over the course of the project.

We structure this proposal in two sub-projects. The first will investigate a general strategy based on importance sampling to deal with very large training sets. While stochastic gradient descent is the standard approach to learning, very little effort has been invested in the choice of the "re-sampling" strategy: virtually every state-of-the-art method visits the samples uniformly throughout learning, without prioritizing according to the computation done previously. Our objective is to develop a general framework for applying importance sampling to gradient descent and other optimization schemes, so that they concentrate the computational effort on problematic and informative samples.

The second sub-project will investigate a central, and currently poorly addressed, practical issue in the use of large neural networks: the meta-optimization of the network structure itself. We are interested in studying approaches that avoid the standard and computationally intensive combination of meta-parameter grid search and hand-tuning.
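The importance-sampling idea at the core of the first sub-project is made concrete in the project's publication "Not All Samples Are Created Equal: Deep Learning with Importance Sampling" (ICML 2018), listed above. The sketch below is a minimal, illustrative PyTorch rendition of the general scheme, not the authors' released implementation: it uses each example's current loss as a cheap stand-in for the per-sample gradient norm (the published method instead uses an upper bound on that norm, computed on a uniformly pre-sampled pool rather than on the full set), samples the mini-batch from the resulting distribution, and reweights the losses so the gradient estimate remains unbiased. All names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def importance_sampled_step(model, optimizer, X, y, batch_size):
    """One training step drawing examples with probability proportional
    to their current loss, reweighted by 1/(N * p_i) so the gradient
    estimate stays unbiased with respect to uniform sampling."""
    N = X.size(0)

    # Score every candidate with a gradient-free forward pass.
    # (Illustrative shortcut: the published method scores only a
    # uniformly pre-sampled pool, with a gradient-norm upper bound.)
    with torch.no_grad():
        scores = F.cross_entropy(model(X), y, reduction="none") + 1e-8
    probs = scores / scores.sum()

    # Sample a mini-batch from the importance distribution.
    idx = torch.multinomial(probs, batch_size, replacement=True)
    weights = 1.0 / (N * probs[idx])  # unbiasedness correction

    optimizer.zero_grad()
    losses = F.cross_entropy(model(X[idx]), y[idx], reduction="none")
    (weights * losses).mean().backward()
    optimizer.step()

# Illustrative usage on random data:
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
X, y = torch.randn(1024, 20), torch.randint(0, 10, (1024,))
for _ in range(10):
    importance_sampled_step(model, optimizer, X, y, batch_size=32)
```

The 1/(N * p_i) weights are what keep the estimator unbiased: under the importance distribution, the expectation of the weighted mini-batch gradient equals the uniform average gradient over all N examples, while the computation concentrates on high-loss samples.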