
Deciding when to decide
Peter Latham
Gatsby Computational Neuroscience Unit, UCL, UK
Decision making in an uncertain world is a hard problem: do we make a decision now based on what we know, or do we wait and gather more information? Both have their costs, and the (hard) computational problem for the brain is to find the right balance.
In a very simple decision-making task, the random dot kinematogram, subjects have to decide whether a set of dots is moving to the right or to the left. The problem is to decide not only which choice to make, but also when to make it. In general the latter problem is harder: should one wait and gather more information, thus increasing the probability of being right but potentially reducing the reward per unit time, or should one make a decision immediately, thus decreasing the probability of being right but potentially increasing the reward per unit time?
This is a classic reinforcement learning problem: the subjects must determine the optimal policy, where, at any given time, the policy specifies whether to commit to a decision or to continue gathering information. In the brain, motion direction is thought to be coded in area MT, so the incoming information on which decision making is based is MT activity. Assuming that the subjects are optimizing reward per unit time, we present a formalism that can be used to find the optimal policy. We then apply that formalism under several assumptions about the statistics of MT neurons and their tuning to direction. We find that the optimal policy is nonstationary, although a stationary policy is not far from optimal.
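
To make the speed-accuracy trade-off concrete, here is a minimal sketch of reward per unit time under a stationary policy. It is not the formalism of this talk: it swaps in a textbook drift-diffusion model, in which noisy evidence accumulates until it hits a fixed bound, and uses the standard closed-form error rate and mean decision time for that model. The parameter names (drift, noise, threshold, overhead) and their values are illustrative assumptions, not quantities from the abstract.

```python
import math

# Assumed stand-in model: evidence drifts at rate `drift` with diffusion
# noise `noise` and a decision is made when it reaches +/- `threshold`.
# `overhead` is the assumed non-decision time per trial (e.g. inter-trial delay).

def error_rate(drift, noise, threshold):
    """Probability of hitting the wrong bound in the drift-diffusion model."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * threshold / noise**2))

def mean_decision_time(drift, noise, threshold):
    """Mean time to reach either bound, starting midway between them."""
    return (threshold / drift) * math.tanh(drift * threshold / noise**2)

def reward_rate(drift, noise, threshold, overhead):
    """Expected correct choices per unit time for a fixed (stationary) bound."""
    p_correct = 1.0 - error_rate(drift, noise, threshold)
    return p_correct / (mean_decision_time(drift, noise, threshold) + overhead)

def best_threshold(drift, noise, overhead, grid):
    """Grid search for the bound that maximizes reward per unit time."""
    return max(grid, key=lambda z: reward_rate(drift, noise, z, overhead))

# Illustrative values only.
grid = [0.01 * k for k in range(1, 501)]
z_star = best_threshold(drift=1.0, noise=1.0, overhead=2.0, grid=grid)
print(f"best stationary threshold: {z_star:.2f}")
print(f"reward rate at that threshold: {reward_rate(1.0, 1.0, z_star, 2.0):.3f}")
```

A very low bound gives fast but near-chance decisions, and a very high bound gives accurate but slow ones; the reward rate peaks in between. A nonstationary policy would let the bound change with elapsed time, which this sketch does not attempt.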