Models of Hippocampally Dependent Navigation
using the Temporal Difference Learning Rule
David Foster, Richard Morris, Peter Dayan
Hippocampus, 10:1-16.
Abstract
This paper presents a model of how hippocampal place cells might be
used for spatial navigation in two watermaze tasks: the standard reference
memory task and a delayed matching-to-place task. In the reference memory
task, the escape platform occupies a single location and rats gradually
learn relatively direct paths to the goal over the course of days, in each
of which they perform a fixed number of trials. In the delayed
matching-to-place task, the escape platform occupies a novel location on
each day, and rats gradually acquire one-trial learning, i.e., direct paths
on the second trial of each day. The model uses a local, incremental, and
statistically efficient connectionist algorithm called temporal difference
learning in two distinct components. The first is a reinforcement-based
"actor-critic" network that is a general model of classical and
instrumental conditioning. In this case, it is applied to navigation, using
place cells to provide information about state. By itself, the actor-critic
can learn the reference memory task, but this learning is inflexible to
changes in the platform location. We argue that one-trial learning in the
delayed matching-to-place task demands a goal-independent representation of
space. This is provided by the second component of the model: a network
that uses temporal difference learning and self-motion information to
acquire consistent spatial coordinates in the environment. Each component
of the model is necessary at a different stage of the task; the
actor-critic provides a way of transferring control to the component that
performs best. The model successfully captures gradual acquisition in both
tasks, and, in particular, the ultimate development of one-trial learning
in the delayed matching-to-place task. Place cells report a form of stable,
allocentric information that is well-suited to the various kinds of
learning in the model.
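
As a rough illustration of the first component described above, the following is a minimal sketch of one temporal difference update for an actor-critic driven by place-cell activity. It is not the authors' implementation. The Gaussian tuning curves, the softmax action choice, the number of actions, and every parameter value (sigma, gamma, lr_critic, lr_actor, beta) are assumptions made for illustration, and all function names are hypothetical.

```python
import numpy as np

def place_activity(pos, centers, sigma=0.16):
    """Firing rates of place cells at a 2-D position.

    Assumption: Gaussian tuning of width sigma around each cell's center.
    """
    d2 = np.sum((centers - pos) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def choose_action(z, f, rng, beta=2.0):
    """Sample an action index from a softmax over actor preferences z @ f.

    Assumption: softmax selection with inverse temperature beta.
    """
    prefs = beta * (z @ f)
    prefs = prefs - prefs.max()          # subtract max for numerical stability
    p = np.exp(prefs)
    p = p / p.sum()
    return int(rng.choice(len(p), p=p))

def td_actor_critic_step(w, z, f, f_next, action, reward, terminal,
                         gamma=0.98, lr_critic=0.1, lr_actor=0.1):
    """One temporal difference update; returns new (w, z) and the TD error.

    w: (n_cells,) critic weights, so the value estimate is V = w @ f.
    z: (n_actions, n_cells) actor weights, one row per action.
    """
    v = w @ f
    v_next = 0.0 if terminal else w @ f_next
    delta = reward + gamma * v_next - v  # TD error: was the outcome better than predicted?
    w_new = w + lr_critic * delta * f    # critic moves its prediction toward the target
    z_new = z.copy()
    z_new[action] += lr_actor * delta * f  # actor reinforces the action just taken
    return w_new, z_new, delta
```

In this sketch the same error signal delta trains both the value estimate and the action preferences, which is why the update is local and incremental. Because the learned values and preferences are tied to one rewarded location, moving the platform requires relearning, consistent with the inflexibility noted in the abstract. The coordinate-learning component is not sketched here.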