Philipp Hennig
(Research Scientist, Department of Empirical Inference Max Planck Institute for Intelligent Systems Tübingen, Germany)
Wednesday 28th September 2011
16.00
Seminar Room B10 (Basement)
Alexandra House, 17 Queen Square, London, WC1N 3AR
Optimal Reinforcement Learning for Gaussian Systems
The exploration-exploitation tradeoff is among the central challenges of reinforcement learning. The exact Bayesian learner, which provides the optimal solution, is intractable in general. As in other inference tasks, however, convenient prior assumptions can allow exact statements. In this talk, I will show that, in the case of Gaussian process inference, it is possible to make analytic statements about optimal learning of both the loss function and the transition dynamics of nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics. The main result is a theoretical insight, but I will also show some approximate numerical results.
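For readers unfamiliar with the inference machinery underlying the talk, the following is a minimal sketch of exact Gaussian process posterior inference (standard GP regression with a squared-exponential kernel, not the speaker's specific construction); all function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D point sets A and B."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """Exact GP posterior mean and pointwise variance at test inputs Xs."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))   # train covariance + jitter
    Ks = rbf_kernel(X, Xs)                          # train/test cross-covariance
    Kss = rbf_kernel(Xs, Xs)                        # test covariance
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

# Illustrative data: noiseless observations of sin(x)
X = np.array([0.0, 1.0, 2.0])
y = np.sin(X)
Xs = np.linspace(0.0, 2.0, 5)
mean, var = gp_posterior(X, y, Xs)
```

Because the posterior is Gaussian and available in closed form, quantities such as expected loss under the model can be computed analytically; this tractability is what the talk exploits for optimal learning in Gaussian systems.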