22. Adaptive Optimal Control Approaches to Sensorimotor Learning

Daniel A. Braun (1,2), dab54@cam.ac.uk
Ad Aertsen (2), aertsen@biologie.uni-freiburg.de
Daniel M. Wolpert (1), wolpert@eng.cam.ac.uk
Carsten Mehring (2), mehring@biologie.uni-freiburg.de

(1) Department of Engineering, University of Cambridge, Cambridge, UK
(2) Bernstein Center for Computational Neuroscience, Albert-Ludwigs-University, Freiburg, Germany

Recently it has been shown that a wide range of motor psychophysical findings can be explained on the basis of stochastic optimal feedback control. Here we extend the optimal control framework to allow for adaptive responses to environmental changes. To compute an optimal action, an optimal feedback controller requires an internal model $F$ of the dynamics of the environment, such that consecutive states $x$ and the motor command $u$ are connected by $x_{t+1} = F(x_t, u)$. In learning experiments this transition function can depend on additional parameters $a_t$ that change over time (e.g. changing loads attached to the arm), so that $x_{t+1} = F(a_t, x_t, u)$. From a theoretical point of view, an adaptive controller has to solve two problems. The first is the structural learning problem, that is, learning the structure of the task $F()$, e.g. the class of visuomotor rotations or gain changes. The second is the parametric learning problem, that is, finding the unknown parameters $a_t$, such as the particular setting of a rotation or gain.
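The distinction between structure and parameters can be illustrated with a minimal sketch. It assumes, purely for illustration, a planar visuomotor rotation in which the state is a 2D cursor position and the motor command is a displacement; the function names `rotation`, `F` and `estimate_angle` are hypothetical and are not taken from the experiments described here. The structure is the family of rotations, and the parameter is the angle, which can be recovered from a single noise-free transition once the structure is known.

```python
import numpy as np

def rotation(angle_rad):
    """2D rotation matrix for a counter-clockwise rotation by angle_rad."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s], [s, c]])

def F(a, x, u):
    """Assumed transition function x_{t+1} = F(a_t, x_t, u):
    the cursor moves by the motor command u rotated by the angle a."""
    return x + rotation(a) @ u

def estimate_angle(x, u, x_next):
    """Parametric learning given the rotation structure: the signed angle
    between the commanded displacement u and the observed displacement."""
    d = x_next - x
    cross = u[0] * d[1] - u[1] * d[0]
    return np.arctan2(cross, u @ d)

a_true = np.deg2rad(30.0)           # hidden parameter (illustrative value)
x = np.zeros(2)                     # current cursor position
u = np.array([1.0, 0.0])            # motor command
x_next = F(a_true, x, u)
a_hat = estimate_angle(x, u, x_next)
print(np.rad2deg(a_hat))
```

Without knowledge of the structure, the same observation would be compatible with many other transformations (for example a shear or a gain change combined with an offset), which is why the structural problem has to be solved first.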

To test experimentally for structural learning, we exposed human subjects to a task with a fixed structure $F()$ that can have different parameterisations $a_t$. Importantly, the parameters of the task changed randomly between blocks of trials, so that no particular parameter setting could be learned, whereas the structure, which remained fixed over the trials, could be. In one of the experiments, we exposed subjects to randomly varying 3D rotations where the rotation angle was drawn from a uniform distribution over [-60°, +60°] every four trials. One group of subjects exclusively experienced random rotations around the vertical axis and the other group around the horizontal axis. Later in the experiment we introduced blocks of probe trials with rotations around either axis that were identical for both groups. Interestingly, the two groups reacted very differently to the same trials: they showed structure-specific facilitation, variability patterns and exploration strategies.

Once the structure of the environmental change is known, optimal adaptive routines can be established to respond to it. These parametric adaptive responses can be computed (approximately) by adaptive optimal control methods. We tested such an adaptive linear quadratic control model in a visuomotor rotation experiment where the rotation angle changed randomly every trial, so that subjects had to adapt online in order to hit the targets. The model's predicted adaptive behaviour was consistent with the experimentally observed kinematics and variability patterns.
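The idea of adapting online within a single movement can be sketched with a certainty-equivalent adaptive linear quadratic controller. This is not the model tested in the experiment; it is a simplified illustration under stated assumptions: a 2D cursor with dynamics $x_{t+1} = x_t + R(a) u_t$ plus optional Gaussian noise, quadratic costs on distance to target and on command size, and a simple running estimate of the rotation angle. All names and numerical values (`run_trial`, `min_cmd`, the cost weights, the learning rate) are hypothetical.

```python
import numpy as np

def rot(a):
    """2D rotation matrix for angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s], [s, c]])

def lqr_gain(A, B, Q, R, horizon):
    """Backward Riccati recursion; returns the feedback gain for the
    first step of a finite-horizon problem with terminal cost Q."""
    P = Q.copy()
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

def run_trial(a_true, target, steps=60, noise_sd=0.0005,
              rate=0.5, min_cmd=0.005, seed=0):
    """One reach under an unknown rotation a_true, with online adaptation."""
    rng = np.random.default_rng(seed)
    A, Q, R = np.eye(2), np.eye(2), 5.0 * np.eye(2)   # assumed model and costs
    x, a_hat = np.zeros(2), 0.0                       # start with no rotation assumed
    for t in range(steps):
        # Certainty equivalence: plan as if the current estimate were true.
        K = lqr_gain(A, rot(a_hat), Q, R, horizon=steps - t)
        u = -K @ (x - target)
        x_next = x + rot(a_true) @ u + noise_sd * rng.standard_normal(2)
        d = x_next - x
        # Update the estimate only when the command is large enough for the
        # observed displacement direction to be informative.
        if np.linalg.norm(u) > min_cmd:
            observed = np.arctan2(u[0] * d[1] - u[1] * d[0], u @ d)
            a_hat += rate * (observed - a_hat)
        x = x_next
    return x, a_hat

a_true = np.deg2rad(40.0)
target = np.array([0.0, 0.1])
x_final, a_hat = run_trial(a_true, target, noise_sd=0.0)
print(np.linalg.norm(x_final - target), np.rad2deg(a_hat))
```

Recomputing the gain at every step keeps the sketch short but is wasteful; a practical implementation would cache gains or update them incrementally. The sketch also ignores signal-dependent noise and state estimation, both of which matter for reproducing variability patterns.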