GATSBY COMPUTATIONAL NEUROSCIENCE UNIT

Two issues in neural value-decision making: emulating the other’s decision processes, and time representations

Hiro Nakahara
Laboratory for Integrated Theoretical Neuroscience, RIKEN Brain Science Institute

I will present two recent studies from our work on neural value-based decision making. The first concerns value-based decision making in social contexts, where predicting another person’s decisions is often indispensable for making one’s own. How does the brain learn and emulate the other’s value-based decision making so as to incorporate it into one’s own? To address this question, we have been conducting a model-based fMRI study, and I will summarize its (unpublished) results. In brief, we found that two distinct errors, the emulated-other’s reward prediction error and the emulated-other’s action prediction error, contribute to learning to emulate the other; these errors are correlated with neural signals in the ventromedial prefrontal cortex and the dorsolateral/dorsomedial prefrontal cortex, respectively.

The second topic is a refinement of the reinforcement learning, or temporal difference (TD) learning, framework for neural valuation processes. Although TD learning has become a major paradigm for understanding value-based decision making and related neural activity (e.g., dopamine activity), the representation of time in neural processes modeled by a TD framework is poorly understood. I will present the formulation and possible consequences of a TD model, called the internal-time TD model, that treats the time of the operator (the neural valuation process) separately from the time of the observer (the experiment). Separating the two times allows us to uncover so-called operator-observer problems that are intrinsic to applying TD models to neural value-based decision making: for example, expressions of the TD error that differ depending on both the time frame and the time unit, and the co-appearance of exponential and hyperbolic discounting at different delays in intertemporal choice tasks.
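
For orientation, the standard quantities that the second topic refers to can be sketched as follows. This is a minimal illustration of the textbook TD(0) error and of the two discount curves, not the internal-time TD model itself. All function names and parameter values (`gamma`, `k`, the example delays) are assumptions chosen for illustration. `rescale_gamma` shows only the elementary point that a per-step discount factor depends on the chosen time unit.

```python
def td_error(reward, value_now, value_next, gamma):
    """Textbook TD(0) error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now


def exponential_discount(delay, gamma):
    """Exponential discounting: weight = gamma ** delay."""
    return gamma ** delay


def hyperbolic_discount(delay, k):
    """Hyperbolic discounting: weight = 1 / (1 + k * delay)."""
    return 1.0 / (1.0 + k * delay)


def rescale_gamma(gamma_per_unit, step_length):
    """Per-step discount factor when one step spans `step_length` time units.

    Keeps the discount over a fixed stretch of time unchanged:
    taking n steps of length 1/n discounts by gamma_per_unit in total.
    """
    return gamma_per_unit ** step_length


if __name__ == "__main__":
    gamma = 0.9  # illustrative per-unit discount factor (assumption)
    k = 0.2      # illustrative hyperbolic discount rate (assumption)

    # Compare the two discount curves at a few example delays.
    for delay in (0, 1, 5, 20):
        print(delay,
              exponential_discount(delay, gamma),
              hyperbolic_discount(delay, k))

    # TD error for a single transition with illustrative values.
    print(td_error(reward=1.0, value_now=0.5, value_next=0.0, gamma=gamma))
```

The exponential curve loses a constant fraction of value per unit delay, whereas the hyperbolic curve falls steeply at short delays and flattens at long ones, which is why the two are usually treated as competing descriptions of intertemporal choice.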