Course Schedule
You'll probably have to click forward until you get to the right week (May 26th - May 30th 2008) to see the timetable...
Projects
Andrew Barto: Reinforcement Learning
Projects with Andrew Barto will focus on pure reinforcement learning. Projects will start by exploring the strengths and weaknesses of different reinforcement learning algorithms, and then proceed to more advanced topics. Examples can be found here.
Nathaniel Daw: Multiplayer games
Students will collect data from pairs of participants repeatedly playing rounds of "work-or-shirk" (a game reminiscent of rock-paper-scissors; see Dorris & Glimcher, Neuron 2005). By varying the payoffs participants expect for different outcomes, students can test whether the participants' overall strategy adjustments follow those predicted by classical game theory. Secondly, students may also examine trial-by-trial strategic adjustments, by fitting reinforcement learning models to the data (as in Hampton et al, PNAS 2008, Camerer & Ho, Econometrica 1999).
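As a rough illustration of what "predicted by classical game theory" means here, the sketch below computes the mixed-strategy equilibrium of a 2x2 game in which neither player has a stable pure strategy. The payoff numbers and the function name are hypothetical choices made for illustration; they are not the payoffs used by Dorris & Glimcher or in the course project.

```python
def mixed_equilibrium_2x2(row_payoffs, col_payoffs):
    """Return (p, q) for a 2x2 game with a unique, fully mixed equilibrium.

    p = probability that the row player picks action 0,
    q = probability that the column player picks action 0.
    Payoff tables are indexed [row action][column action].
    Assumes no pure-strategy equilibrium exists (not checked here).
    """
    (a, b), (c, d) = row_payoffs
    (e, f), (g, h) = col_payoffs
    # q makes the row player indifferent: q*a + (1-q)*b == q*c + (1-q)*d
    q = (d - b) / (a - b - c + d)
    # p makes the column player indifferent: p*e + (1-p)*g == p*f + (1-p)*h
    p = (h - g) / (e - f - g + h)
    return p, q


# Hypothetical payoffs. Row = employee (0: work, 1: shirk),
# column = employer (0: inspect, 1: do not inspect).
employee = [[1.0, 1.0], [0.0, 2.0]]
employer_costly = [[1.0, 2.0], [-1.0, -2.0]]   # inspection costs 1.0
employer_cheap = [[1.5, 2.0], [-0.5, -2.0]]    # inspection costs 0.5

p_work_costly, q_inspect_costly = mixed_equilibrium_2x2(employee, employer_costly)
p_work_cheap, q_inspect_cheap = mixed_equilibrium_2x2(employee, employer_cheap)
print(p_work_costly, q_inspect_costly)
print(p_work_cheap, q_inspect_cheap)
```

With these made-up numbers, halving the employer's inspection cost raises the employee's equilibrium work rate from 0.5 to 0.75 while the employer's inspection rate stays at 0.5. In a mixed equilibrium each player's mixing probability is set by the opponent's payoffs, which is the kind of prediction that varying the payoffs can test.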
Anthony Dickinson and Simon Killcross: Habits and actions
Students will run learning experiments and gather data on the difference between habitual and goal-directed learning in humans. Time permitting, these data will be fitted with RL models, and the quality of the fits to the goal-directed and habitual parts of the task will be compared.
Michael Frank: Dopamine receptors
Projects with Michael Frank will concentrate on fitting reinforcement learning models to human data from probabilistic learning tasks. The data (paper) come from both healthy controls and people with Parkinson's disease, and allow an investigation of the consequences of dopamine dysfunction. In addition, students will explore the effects of D1 and D2 receptors in the context of these data and a detailed, biophysically realistic model.
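Fitting an RL model to choice data usually means scoring how probable the observed choices are under the model for given parameter values. The sketch below assumes a simple Q-learning model with a softmax choice rule; the function name, the parameters alpha (learning rate) and beta (inverse temperature), and the toy data are illustrative assumptions, not the model or data used in the project.

```python
import math


def q_learning_nll(choices, rewards, alpha, beta, n_options=2):
    """Negative log-likelihood of a choice sequence under softmax Q-learning.

    choices: chosen option index on each trial
    rewards: reward received on each trial
    alpha:   learning rate (assumed, 0..1)
    beta:    softmax inverse temperature (assumed, >= 0)
    """
    q = [0.0] * n_options
    nll = 0.0
    for c, r in zip(choices, rewards):
        # Softmax choice probabilities; subtract the max for numerical stability.
        m = max(q)
        exps = [math.exp(beta * (v - m)) for v in q]
        p_choice = exps[c] / sum(exps)
        nll -= math.log(p_choice)
        # Prediction-error update of the chosen option only.
        q[c] += alpha * (r - q[c])
    return nll


# Toy data (made up): option 0 is rewarded, option 1 is not.
toy_choices = [0, 1, 0, 0]
toy_rewards = [1.0, 0.0, 1.0, 1.0]
nll_random = q_learning_nll(toy_choices, toy_rewards, alpha=0.5, beta=0.0)
nll_learner = q_learning_nll(toy_choices, toy_rewards, alpha=0.5, beta=2.0)
print(nll_random, nll_learner)
```

Minimising this quantity over alpha and beta (by grid search or a numerical optimiser) gives maximum-likelihood parameter estimates, which can then be compared between groups such as patients and controls.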
Adam Kepecs: How uncertainty boosts learning
According to statistical learning theory, the rate of learning should depend on the current estimate of uncertainty: learn more when uncertain and less when certain. Students will explore various aspects of these theories by running psychophysical experiments and comparing the results to rat behavioral data, focusing on the trial-by-trial updating of decision strategies. Additionally, students can examine how simple extensions of reinforcement learning models can account for their data.
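One standard way to make "learn more when uncertain" concrete is a scalar Kalman filter, in which the learning rate (the gain) is the ratio of the current uncertainty to the total variance. This is a minimal sketch under that assumption; the function and parameter names are illustrative, and this is not necessarily the model used in the project.

```python
def kalman_update(mean, var, obs, obs_noise_var, drift_var=0.0):
    """One update of a scalar Kalman filter tracking a hidden quantity.

    mean, var:      current estimate and its uncertainty (variance)
    obs:            new noisy observation
    obs_noise_var:  variance of the observation noise (assumed known)
    drift_var:      variance added per trial if the quantity can drift
    """
    gain = var / (var + obs_noise_var)       # learning rate for this trial
    new_mean = mean + gain * (obs - mean)    # prediction-error update
    new_var = (1.0 - gain) * var + drift_var
    return new_mean, new_var, gain


# With a stable environment (no drift), uncertainty shrinks with each
# observation, and so does the learning rate.
mean, var = 0.0, 1.0
gains = []
for obs in [1.0, 1.0, 1.0, 1.0]:
    mean, var, gain = kalman_update(mean, var, obs, obs_noise_var=1.0)
    gains.append(gain)
print(gains)
```

Setting drift_var above zero keeps the uncertainty, and hence the learning rate, from decaying to zero, which is how such a model keeps learning in a changing environment.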
Yael Niv: The effects of noise on temporal difference learning
Temporal difference (TD) learning is by now well-ingrained into our thinking about the role of dopamine in learning. However, TD models usually assume a fully observable state space, which is known to be an unrealistic simplification. In this project we will examine the effects of different sources of noise on TD learning. We will consider external noise (probabilistic rewards, as in Nakahara et al. (2004), Morris et al. (2004) and Fiorillo et al. (2003)), internal noise as a result of a noisy representation, and, most importantly, timing noise, which is inherent in most learning scenarios (see the first part of Gallistel & Gibbon (2000)'s "Time, rate and conditioning" for a comprehensive review).
Suggested directions:
- Investigate the robustness of tapped-delay line TD to each source of noise and compare to available data.
  Niv Y., Duff M.O. and Dayan P. -- The effects of uncertainty on TD learning [pdf]
  Niv Y., Duff M.O. and Dayan P. (2005) -- Dopamine, uncertainty and TD learning [pdf]
- (More advanced) Incorporate a semi-Markov framework and investigate scalar timing noise.
  Daw N.D., Courville A.C. and Touretzky D.S. (2003) -- Timing and partial observability in the dopamine system [pdf]
  Daw N.D., Courville A.C. and Touretzky D.S. (2002) -- Dopamine and inference about timing [pdf]
  Gibbon J. (1977) -- Scalar expectancy theory and Weber's Law in animal timing -- Psychological Review, 84, 279-325.
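As a possible starting point for the first direction, here is a minimal sketch of TD(0) with a tapped-delay line representation (one weight per time step since the cue) and probabilistic reward as the external noise source. The trial structure, parameter values and function name are assumptions made for illustration; they are not taken from the papers listed above.

```python
import random


def run_td_delay_line(n_trials=5000, n_steps=10, reward_step=8,
                      p_reward=0.5, alpha=0.01, gamma=1.0, seed=0):
    """Learn values over a trial with TD(0) and a tapped-delay line.

    The cue arrives at step 0. Each later step activates its own "tap",
    so the value at step t is just the weight w[t]. Reward of 1 is given
    on leaving step `reward_step` with probability p_reward.
    """
    rng = random.Random(seed)
    w = [0.0] * n_steps
    for _ in range(n_trials):
        rewarded = rng.random() < p_reward
        for t in range(n_steps):
            r = 1.0 if (rewarded and t == reward_step) else 0.0
            v_next = w[t + 1] if t + 1 < n_steps else 0.0  # trial end has value 0
            delta = r + gamma * v_next - w[t]              # TD prediction error
            w[t] += alpha * delta
    return w


values = run_td_delay_line()
print([round(v, 2) for v in values])
```

With gamma = 1, the learned values for steps up to the reward time should hover around the reward probability, and the value after the reward time stays at zero. Timing noise could be explored from here by, for example, jittering reward_step from trial to trial.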
Angela Yu: Neuromodulation and uncertainty
This project will concentrate on how the neurophysiological effects of ACh/NE at the cellular level can carry out computations at the systems level (signaling different kinds of uncertainty: expected and unexpected uncertainty, see Yu and Dayan 2005). Angela Yu's recent work on adaptive responding to a changing world provides a context in which one can think about this concretely. It turns out the necessary computations can be implemented approximately by leaky integrator neurons, and ACh/NE can play the roles of adjusting the relative weights of the feedforward and recurrent terms, which correspond at a systems level to their distinct uncertainty roles, and at a computational level to changing the time constant of integration over time.
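The link between the feedforward/recurrent weighting and the integration time constant can be seen in a discrete-time leaky integrator. The sketch below is a generic illustration under assumed names and values; it is not the specific model from Yu and Dayan (2005) or from the project.

```python
def leaky_integrate(inputs, feedforward_weight, x0=0.0):
    """Discrete-time leaky integrator.

    x <- (1 - a) * x + a * u, where a = feedforward_weight (0..1) scales
    the new input u and (1 - a) scales the recurrent term. A larger a
    means a shorter effective time constant (roughly 1/a steps).
    """
    x = x0
    trace = []
    for u in inputs:
        x = (1.0 - feedforward_weight) * x + feedforward_weight * u
        trace.append(x)
    return trace


# The input steps from 0 to 1, mimicking an unsignalled change in the world.
step_input = [0.0] * 5 + [1.0] * 5
fast = leaky_integrate(step_input, feedforward_weight=0.5)
slow = leaky_integrate(step_input, feedforward_weight=0.1)
print(fast[-1], slow[-1])
```

Five steps after the change, the unit with the larger feedforward weight has nearly caught up with the new input, while the other still relies mostly on its past state. Shifting weight toward the feedforward term therefore acts like discounting old evidence when the world is believed to have changed.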
Course content / syllabus
Monday :: Introduction to Animal Learning
Anthony Dickinson, Cambridge University: The psychology of animal learning
Lecture 1
- Pavlovian Conditioning
- Instrumental (operant) conditioning
- Associative learning processes
Lecture 2
- Distinguishing actions from habits
- Pavlovian-Instrumental transfer
- Associative-cybernetic model
- Dual-process theories
Slides for Conditioning and Learning class and for Conditioning and Behaviour class.
Simon Killcross, University of Cardiff: The neurobiology of animal learning
Lecture 1: Pavlovian learning
- Amygdala: aversive and appetitive
- Neurobiology of omission schedules
- Preparatory and consummatory conditioning
- Neurobiology of associative processes
Lecture 2: Instrumental learning
- Neurobiology of habits and goal-directed actions
- Neural dissociations of outcome specific and general PIT
Tuesday :: Introduction to Reinforcement Learning (RL)
Andrew Barto, UMass Amherst
Lecture 1: basics
- The reinforcement learning problem
- Value functions / Policies
- Bellman equation
- Evaluating a decision tree: strengths and challenges
- Dynamic programming: value of a fixed policy
- Monte Carlo approaches: TD learning and Q learning
Lecture 2: advanced topics
- Generalisation and the use of neural networks
- Representation
- Multiple controllers
- Direct policy methods
Slides for Lecture 1 and Lecture 2.
Wednesday :: RL and neuromodulation
Yael Niv, Princeton University
Dopamine: from anhedonia to motivation
- Rewards: The TD story
- Dopamine and punishments
- Wanting vs liking
- Motivation
Slides for the lecture.
Angela Yu, Princeton University
Acetylcholine and Norepinephrine
- Attention and Learning: the role of uncertainty
- ACh & NE: expected and unexpected uncertainty
Quentin Huys, Columbia University
Serotonin
- 5HT and reflexive actions
- 5HT and temporal discounting
- 5HT and DA in psychiatry
Slides for the lecture.
Thursday :: Neuroeconomics and multiplayer games
Nathaniel Daw, NYU
Lecture 1: The neurobiology of reinforcement learning
- values from revealed preference
- actors and critics in the brain
- a single value function?
- habits vs goal-directed behaviour
- beyond habits: model-based decisions
- beyond TD: non-normative choices
Lecture 2: Multiplayer games and social effects
- Introduction to game theory
- Modelling the world
- Reinforcement learning in a social context
Friday :: RL and psychiatry mini-symposium
Quentin Huys, Columbia University
Introduction
Adam Kepecs, Cold Spring Harbor Laboratories
Anxiety: the role of uncertainty
- Animal models of anxiety
- Fear conditioning
- The effects of uncertainty
Quentin Huys, Columbia University
Introduction: computational psychiatry
Depression
- Anhedonia
- Learned helplessness
- Modelling and testing learned helplessness in humans
Slides for the lecture.
Michael Moutoussis, SW London & St. George's Mental Health NHS Trust
Schizophrenia
- Severe mental illness, including Schizophrenia
- Explanatory theories of psychosis
- Paranoid psychosis
- Psychological and Computational models of psychosis.
Jonathan Williams, Institute of Psychiatry
ADHD
- impulsivity / discounting: DA and 5HT
- modelling in psychiatry...?
Michael Frank, University of Arizona
Parkinson's Disease
- DA in Parkinson's
- Receptor-specific effects of dopamine
- Learning from rewards and punishments
Slides for the lecture.