Gatsby Computational Neuroscience Unit


Marc Deisenroth


Wednesday 10th January 2018


Time: 4.00pm


Ground Floor Seminar Room

25 Howland Street, London, W1T 4JG


Data-Efficient Learning for Autonomous Robots

Trial-and-error based reinforcement learning (RL) has seen rapid
advancements in recent times, especially with the advent of deep neural
networks. However, the majority of autonomous RL algorithms either rely
on engineered features or a large number of interactions with the
environment. Such a large number of interactions may be impractical in
many real-world applications. For example, robots are subject to wear
and tear and, hence, millions of interactions may change or damage the system.
To address this problem, current learning approaches typically require
task-specific knowledge in the form of expert demonstrations, pre-shaped
policies, or the underlying dynamics. In the first part of the talk, I
follow a different approach and speed up learning by efficiently
extracting information from sparse data. In particular, we propose to
learn a probabilistic, non-parametric Gaussian process dynamics model.
By explicitly incorporating model uncertainty in long-term planning and
controller learning, my approach reduces the effects of model errors, a
key problem in model-based learning. Compared to state-of-the-art
reinforcement learning, our model-based policy search method achieves an
unprecedented speed of learning, which makes it most promising for
application to real systems. We demonstrate its applicability to
autonomous learning from scratch on real robot and control tasks. To
reduce the number of system interactions while naturally handling state
or control constraints, we extend the above framework and propose a
model-based RL framework based on Model Predictive Control (MPC) using
learned probabilistic dynamics models. We provide theoretical guarantees
for first-order optimality in GP-based transition models with
deterministic approximate inference for long-term planning. The proposed
framework demonstrates superior data efficiency and learning rates
compared to the current state of the art.

Key references:

[1] Marc P. Deisenroth, Dieter Fox, Carl E. Rasmussen, Gaussian
Processes for Data-Efficient Learning in Robotics and Control, IEEE
Transactions on Pattern Analysis and Machine Intelligence, volume 37,
pp. 408–423, 2015

[2] Sanket Kamthe, Marc P. Deisenroth, Data-Efficient Reinforcement
Learning with Probabilistic Model Predictive Control, AISTATS 2018