Project

State representation in reward based learning -- from spiking neuron models to psychophysics

English title State representation in reward based learning -- from spiking neuron models to psychophysics
Applicant Gerstner Wulfram
Number 122697
Funding scheme Sinergia
Research institution EPFL - IC - ISIM - LCN
University EPF Lausanne - EPFL
Main discipline Computer science
Start/End 01.01.2009 - 31.08.2012
Approved amount 1'481'614.00

Keywords (11)

computational neuroscience; spiking neurons; learning; reinforcement learning; decision making; perceptual learning; psychophysics; memory; behavior; neurons; synaptic plasticity

Lay Summary (English)

Lay summary
Humans and animals learn by changing the strength of connections between neurons. Suppose we have to learn to navigate through a complex maze, a labyrinth of the kind often found in the gardens of old castles. In this case we need to learn to turn left at the first bifurcation, right at the second bifurcation, take a sharp left turn at the crossing, and so on.
Suppose now that different locations correspond to different 'place' neurons, and that turning left and turning right correspond to two other populations of neurons.
If we strengthen the connection between the neurons coding for the first bifurcation and the cells coding for a left turn, then the combination 'turn left at first bifurcation' becomes more likely.
In this project we use models to study whether there is an optimal way of changing the connections after a successful action, and how this could be implemented in biologically plausible neuronal models.
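The maze example can be illustrated with a small toy simulation. The sketch below is not one of the spiking neuron models studied in the project; it is a minimal, rate-free caricature that assumes one 'place' unit per bifurcation, two action units (left, right), and a simple rule that strengthens a place-to-action connection whenever the chosen turn was successful. All names (`choose`, `train`, `greedy_policy`) and parameter values are hypothetical.

```python
import math
import random

ACTIONS = ["left", "right"]


def choose(weights, place, rng):
    """Sample an action index for a place using softmax over connection strengths."""
    row = weights[place]
    top = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(w - top) for w in row]
    threshold = rng.random() * sum(exps)
    cumulative = 0.0
    for action_index, value in enumerate(exps):
        cumulative += value
        if threshold < cumulative:
            return action_index
    return len(exps) - 1


def train(correct_turns, episodes=500, learning_rate=0.5, seed=0):
    """Run maze episodes; strengthen a place->action connection after each successful turn."""
    rng = random.Random(seed)
    # weights[place][action]: connection strength from a place unit to an action unit
    weights = [[0.0 for _ in ACTIONS] for _ in correct_turns]
    for _ in range(episodes):
        for place, correct in enumerate(correct_turns):
            action = choose(weights, place, rng)
            if action != correct:
                break  # wrong turn: dead end, the episode ends without any change
            weights[place][action] += learning_rate  # successful action: strengthen
    return weights


def greedy_policy(weights):
    """Return the most strongly connected action for each place."""
    return [ACTIONS[row.index(max(row))] for row in weights]


if __name__ == "__main__":
    # Maze from the text: left at the first bifurcation, right at the second, left at the crossing.
    learned = train([0, 1, 0])
    print(greedy_policy(learned))
```

Because a connection is only strengthened after a successful turn, the correct combination at each location becomes progressively more likely, which is the qualitative effect described above. Whether such a rule is optimal, and how it maps onto biologically plausible plasticity, is exactly the open question the project addresses.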
Last update: 21.02.2013

Publications

Publication
Different types of feedback change decision criterion and sensitivity differently in perceptual learning (2012), in Journal of Vision, 12(3), 1-11.
Perceptual learning of motion discrimination by mental imagery (2012), in Journal of Vision, 12(6), 1-10.
Incremental Slow Feature Analysis: Adaptive Low-Complexity Slow Feature Updating from High-Dimensional Input Streams (2012), in Neural Computation, 24(11), 2994-3024.
Spike-based Decision Learning of Nash Equilibria in Two-Player Games (2012), in PLoS Comput Biol., 8(9), e1002691-e1002691.
Personality traits in rats predict vulnerability and resilience to developing stress-induced depression-like behaviors, HPA axis hyper-reactivity and brain changes in pERK1/2 activity (2012), in Psychoneuroendocrinology, 37(8), 1209-1223.
About similar characteristics of visual perceptual learning and LTP (2012), in Vision Research, 61, 100-106.
Perceptual learning, roving and the unsupervised bias (2012), in Vision Research, 61, 95-99.
Vulnerability of conditional NCAM-deficient mice to develop stress-induced behavioral alterations (2012), in Stress: The International Journal on the Biology of Stress, 15(2), 195-206.
Paradoxical Evidence Integration in Rapid Decision Processes (2012), in PLoS Computational Biology, 8(2), 1-10.
Gradient estimation in dendritic reinforcement learning (2012), in The Journal of Mathematical Neuroscience, 2(2), 1-19.
Intrinsically Motivated NeuroEvolution for Vision-Based Reinforcement Learning (2011), in 2011 IEEE International Conference on Development and Learning (ICDL), 1-7.
Variational Learning for Recurrent Spiking Networks (2011), in NIPS 2011 Proceedings, 1-9.
Evidence for a Role of Oxytocin Receptors in the Long-Term Establishment of Dominance Hierarchies (2011), in Neuropsychopharmacology, 36(11), 2349-2356.
Social memories in rodents: Methods, mechanisms and modulation by stress (2011), in Neurosci Biobehav Rev, 36(7), 1762-1773.
A Peptide Mimetic Targeting Trans-Homophilic NCAM Binding Sites Promotes Spatial Learning and Neural Plasticity in the Hippocampus (2011), in PLoS ONE, 6(8), 1-13.
Neural mechanisms and computations underlying stress effects on learning and memory (2011), in Current Opinion in Neurobiology, 21(3), 502-508.
Spatio-Temporal Credit Assignment in Neuronal Population Learning (2011), in PLoS Computational Biology, 7(6), 1-13.
Glucocorticoids act on glutamatergic pathways to affect memory processes (2011), in Trends in Neurosciences, 34(4), 165-176.
Slow Feature Analysis (2011), in Scholarpedia, 6(4), 5282-5282.
Stress during Adolescence Increases Novelty Seeking and Risk-Taking Behavior in Male and Female Rats (2011), in Front Behav Neurosci, 5(17), 1-10.
Does Perceptual Learning Suffer from Retrograde Interference? (2010), in PLoS ONE, 5(12), 1-6.
Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity (2010), in Journal of Neuroscience, 30(40), 13326-13337.
Learning under stress: the inverted-U-shape function revisited (2010), in Learn Mem, 17(10), 522-530.
Learning Spike-Based Population Codes by Reward and Population Feedback (2010), in Neural Computation, 22(7), 1698-1717.
Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail (2009), in PLoS Computational Biology, 5(12), 1-17.
Code-Specific Policy-Gradient Rules for Spiking Neurons (2009), in Advances in Neural Information Processing Systems, 22, 1741-1749.
Interleaving bisection stimuli - randomly or in sequence - does not disrupt perceptual learning, it just makes it more difficult (2009), in Vision Research, 49(21), 2591-2598.
Stress, genotype and norepinephrine in the prediction of mouse behavior using reinforcement learning (2009), in Nature Neuroscience, 12(9), 1180-1180.
Modeling perceptual learning: Why mice do not play backgammon (2009), in Learning & Perception, 1(1), 155-163.
Reinforcement learning in populations of spiking neurons (2009), in Nature Neuroscience, 12(3), 250-252.
Code-specific synaptic plasticity improves learning, in The Journal of Neuroscience.
Human learning in non-Markovian decision making, in PLoS Computational Biology.

Associated projects

Number Title Start Funding scheme
145004 In vivo fast-scan cyclic voltammetry detection of neurotransmitters: A focus on dopamine 01.01.2013 R'EQUIP
108102 The role of the neural cell adhesion molecule in stress-induced cognitive and neural disturbances 01.04.2005 Project funding (Div. I-III)
133853 A phenogenomic approach to identify novel determinants of mitochondrial function 01.10.2011 R'EQUIP
147636 Learning from delayed and sparse feedback 01.12.2013 Sinergia
135710 Stress and the Social Brain: The role of neuropeptides and synapse-specific neuroplasticity molecules 01.04.2011 Project funding (Div. I-III)
117975 Coding Characteristics of Neuron Models 01.10.2007 Project funding (Div. I-III)
114404 Top-down and bottom-up processes in perceptual learning 01.01.2007 ProDoc (research module, FM)
113364 Theory and Practice of Reinforcement Learning 01.02.2007 Project funding (Div. I-III)
133094 Dendritic pointers and time multiplexing as cortical binding mechanisms 01.05.2011 Project funding (Div. I-III)

Abstract

Reward-based learning encompasses a broad class of machine learning algorithms that optimize the behavior of an agent (e.g. a real or simulated robot) so as to maximize the total expected reward. These algorithms describe learning in machines that is reminiscent of learning in animals or humans, as studied in animal behavior (e.g. conditioning) or human psychophysics. Learning in humans or animals is in turn thought to be related to changes in synaptic connections between neurons in the brain. Hence the question arises whether models of synaptic plasticity at the level of spiking neurons can be connected to formal 'reinforcement' learning models in machine learning and to human psychophysics and animal behavior.

This project combines the expertise of two computational neuroscience laboratories (EPFL-LCN/Wulfram Gerstner and Univ. Berne/Walter Senn), both of which have previously worked on spike-based models of synaptic plasticity, with the machine learning expertise of the Schmidhuber group at IDSIA (Lugano), which has a long-standing track record in formal models of reinforcement learning, the psychophysics laboratory of Michael Herzog (EPFL-LPSY), which has a long tradition in human vision and perceptual learning, and the rodent behavior expertise of Carmen Sandi (EPFL-BMI).