
Unsupervised Learning of Interactions from Real Data

English title Unsupervised Learning of Interactions from Real Data
Applicant Favaro Paolo
Number 188690
Funding scheme Project funding (Div. I-III)
Research institution Institut für Informatik Universität Bern
Institution of higher education University of Berne - BE
Main discipline Information Technology
Start/End 01.09.2020 - 31.08.2023
Approved amount 749'852.00

All Disciplines (2)

Information Technology

Keywords (5)

self-supervised learning; transfer learning; unsupervised learning; learning interactions; disentangling factors of variation

Lay Summary (translated from Italian)

Building agents that learn on their own, directly from the data they observe, is one of the fundamental goals of machine learning. In contrast to the field of reinforcement learning, where agents can also interact with the environment, here we are interested in what can be learned passively, through observation alone. Recent progress in machine learning suggests that a great deal can be learned in this mode, which has the advantage of being extremely practical and realistic to implement, because it requires neither robots nor simulated environments.
Lay summary
Subject and objective
The goal of this project is to build agents capable of operating in the real world (in other words, capable of coordinating actions, such as navigating around obstacles or manipulating objects) without direct supervision. In our context, an agent is a machine that can observe data through various sensors (e.g., cameras and microphones) and that can perform actions such as changing its position or interacting with objects. We are interested in building agents that can develop these abilities by learning from indirect, passive observation of the interactions of other agents in real (and therefore not simulated) data.

Socio-scientific context
These agents are expected to autonomously develop the ability to recognize and localize objects, to recognize and predict their dynamics, and to recognize their interactions.
This will make it possible to automate tasks that currently require costly manual annotation or that cannot be annotated because of data confidentiality (banks, insurance companies, the military) or a lack of experts (in the medical field).

Last update: 24.05.2020

Associated projects

Number Title Start Funding scheme
169622 Analysis and Design of Self-Supervised Learning Methods 01.04.2017 Project funding (Div. I-III)


Our ultimate goal is to build an agent that learns to operate in the real world (i.e., to plan actions, such as navigating around obstacles or grasping objects) without direct supervision. In our definition, an agent is a machine that can observe data through several sensors (e.g., cameras and microphones) and that can perform actions such as changing its position or interacting with objects. An agent can learn either by direct interaction or through passive observation of the interactions between other agents. In this proposal we focus on the latter learning strategy. We investigate methods for an agent to build a representation of the observed data, i.e., an internal model of the environment, the agents in it and their interactions. In contrast to related work, we do not learn from simulated data, where the agent is aware of its own actions, but directly from real data. The agent learns this representation by using it to make and validate predictions of the observations. In particular, the learned representation aims to predict the consequences of actions and to determine what actions are needed to produce changes to other agents.
To tackle this goal, a first aspect is to choose the data to learn from. Unfortunately, collecting labeled examples, which are needed to train with supervised learning methods, does not seem a viable solution. Manual annotation of objects and actions is costly, error-prone and time-consuming, may be ill-defined, and may introduce undesired bias into the training. A second aspect is whether the proposed “passive” learning is even possible, and thus whether one should instead focus on learning through direct interaction, i.e., with a physical agent (a robot) or a simulated one (e.g., in a video game).
Using a robot to learn through direct interaction with the real world is challenging, because the interaction process requires either a very long time (physics and technology limit the speed of operation of a robot) or working with several robots in parallel, which is costly. Working with simulations instead faces limitations due to the gap between the simulated and the real environments. Moreover, current research in self-supervised learning, representation learning and disentangling of factors of variation shows that passive learning is possible and that its full potential is still largely untapped.
Our proposed approach is to make an agent learn about objects and other agents through passive observation of their interactions. We use existing datasets of real images and video sequences to learn representations of the environment (e.g., global attributes such as illumination and the point of view of the agent), of the objects (attributes such as their location, pose, category, 3D surface and appearance, including texture and materials) and of the actions (e.g., actions can be associated with changes of the object attributes). We expect that achieving the above objectives will have a strong impact in both science and industry. Building object representations that enable detection, prediction and learning from object interactions without human annotation has the potential to solve machine learning problems at a large scale without data privacy concerns (e.g., in the medical and military fields, and in the banking and insurance industries).