
Robot skills acquisition through active learning and social interaction strategies (ROSALIS)

Applicant Odobez Jean-Marc
Number 172627
Funding scheme Project funding (Div. I-III)
Research institution IDIAP Institut de Recherche
Institution of higher education Idiap Research Institute - IDIAP
Main discipline Information Technology
Start/End 01.04.2018 - 31.03.2022
Approved amount 740'766.00

Keywords (8)

Human-Robot Interaction; Gesture; Active learning; Attention; Visual Perception; Skill Transfer; Behavior; Learning by Demonstration

Lay Summary (translated from French)

Teaching a robot gestures through social interactions

In the ROSALIS project, social interactions will be exploited to teach a robot gestures and tasks. These interactions will take the form of requests by the robot for a demonstration, questions about the gesture to be learned, and reproduction attempts, in increasingly complex situations that the robot devises, to show and validate what it has learned.


The research will advance on several fronts. First, the gesture representation will include different levels of plasticity, making it possible to adjust the properties of a gesture or to freeze it definitively as an element of a repertoire once its reproduction is mastered. This will rely on a mathematical formalization of the representation of the invariant elements of a gesture. In addition, the robot will learn through an active approach, allowing it to form hypotheses about the properties of the gesture by analyzing the information it receives (gesture reproductions, answers to questions, etc.) and to suggest questions or demonstrations that resolve ambiguities.


Second, to make the interactions natural, perception algorithms will be developed to interpret and temporally segment a person's behaviors and intentions by analyzing different multimodal elements of the interaction. Representation and perception will be integrated into the definition of interaction units, involving the synthesis of different demonstrations, queries, and social signals in verbal or non-verbal form, as well as the selection of the units that facilitate progressive learning.


This work will find applications in service and industrial robotics, sectors that require the ability to reprogram robots in an efficient and personalized way.

Last update: 25.03.2018

Associated projects

Number Title Start Funding scheme
164022 Platform for Reproducible Acquisition, Processing, and Sharing of Dynamic, Multi-Modal Data 01.07.2016 R'EQUIP
180445 Heap: Human-Guided Learning and Benchmarking of Robotic Heap Sorting 01.05.2019 CHIST-ERA


Most efforts in robot learning from demonstration are directed toward developing algorithms for the acquisition of specific skills from training data. While such developments are important, they often do not take into account the social structure of the process, in particular the fact that the interaction with the user and the selection of the different interaction steps can directly influence the quality of the collected data. Similarly, while skills acquisition encompasses a wide range of social and self-refinement learning strategies, including mimicking (without understanding the objective), goal-level emulation (discovering the objectives while discarding the specific way in which a task is achieved), and exploration with self-assessed rewards or feedback from the users, each of these requires the design of dedicated algorithms, yet the ways in which they can be organized have been overlooked so far.

In ROSALIS, we propose to rely on natural interactions for skill learning, defined as an open-ended sequence of pragmatic frames: recurrent, naturally negotiated protocols that have emerged over time, involving queries about the skills and answers to them, including demonstrations made by both the human and the robot to show what it has learned.

The research will advance on several fronts. First, for skills representation, the robot learners will require an appropriate level of plasticity, allowing them to adapt and refine a skill primitive currently being learned, as well as to freeze a skill primitive as part of a repertoire once the skill is mastered. Learning plasticity will be explored and mathematically formalized as a statistical invariance extraction problem applied concurrently to several levels of representation.
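As a minimal illustration of the statistical invariance idea (a hypothetical simplification, not the project's actual formalization), dimensions that vary little across aligned demonstrations of a gesture can be treated as candidate invariants to freeze, while high-variance dimensions remain plastic and open to refinement:

```python
import statistics

def extract_invariants(demos, tol=1e-2):
    """Identify invariant dimensions across demonstrations.

    demos: list of demonstrations, each a list of equal length whose
           elements are tuples of coordinates (assumed pre-aligned in
           time; alignment itself, e.g. by DTW, is out of scope here).
    tol:   variance threshold below which a dimension is considered
           invariant at that timestep.

    Returns, per timestep, the set of dimension indices whose variance
    across demonstrations falls below `tol` (candidates to freeze);
    the remaining dimensions stay plastic.
    """
    n_steps = len(demos[0])
    n_dims = len(demos[0][0])
    invariants = []
    for t in range(n_steps):
        frozen = {
            d for d in range(n_dims)
            if statistics.pvariance(demo[t][d] for demo in demos) < tol
        }
        invariants.append(frozen)
    return invariants
```

For example, three noisy 2-D reach demonstrations in which the first coordinate is held constant would yield `{0}` at each timestep, marking that coordinate as invariant.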
Furthermore, active learning methodologies will be developed that rely on heterogeneous sources of information (demonstrations, feedback labels, properties), allowing the robot to form hypotheses about the skill invariants and to suggest demonstrations or queries.

Second, to allow natural interactions, we will design perception algorithms that provide a higher-level understanding of people's behaviors and intentions, including gaze information and multimodal action recognition and segmentation. The different mechanisms will be integrated within the definition of the pragmatic frames and will imply the coordination (selection, timing) of different components: (i) real-time interpretation of the different multimodal inputs; (ii) synthesis of different demonstrations (primitives, partial or full instance gestures), as well as queries and social signals expressed through verbal (questions, grounding) and non-verbal behaviors (audio backchannels, head gestures and nodding, gaze behaviors); (iii) selection of the interaction units to build scaffolded interactions, exploiting hypotheses about the skill and allowing the system to combine different learning strategies.

We target applications of robots in both manufacturing and home/office environments, both of which require re-programming robots in an efficient and personalized manner.
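One generic way such active query suggestion can work is query-by-committee (an illustrative assumption here, not necessarily the project's method): the robot maintains several hypotheses consistent with the evidence so far and asks about, or requests a demonstration in, the situation where those hypotheses disagree most:

```python
import statistics

def select_query(hypotheses, candidate_probes):
    """Query-by-committee style active query selection (sketch).

    hypotheses: callables mapping a probe situation to a predicted
                outcome; each is consistent with the data seen so far.
    candidate_probes: situations the robot could ask about next.

    Returns the probe on which the hypotheses' predictions disagree
    most (highest variance), i.e. the most informative next query.
    """
    def disagreement(probe):
        predictions = [h(probe) for h in hypotheses]
        return statistics.pvariance(predictions)

    return max(candidate_probes, key=disagreement)
```

With hypotheses `x`, `2*x`, and `-x` over probes `[0, 1, 2]`, the probe `2` is selected, since the predictions diverge most there; a demonstration or answer at that point eliminates the most ambiguity.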