Department of Neurobiology and Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA 94305, USA.
Prog Neurobiol. 2020 Dec;195:101881. doi: 10.1016/j.pneurobio.2020.101881. Epub 2020 Jul 3.
The consequences of individual actions are typically unknown until well after they are executed. This fact necessitates a mechanism that bridges delays between specific actions and reward outcomes. We looked for the presence of such a mechanism in the post-movement activity of neurons in the frontal eye field (FEF), a visuomotor area in prefrontal cortex. Monkeys performed an oculomotor gamble task in which they made eye movements to different locations associated with dynamically varying reward outcomes. Behavioral data showed that monkeys tracked reward history and made choices according to their own risk preferences. Consistent with previous studies, we observed that the activity of FEF neurons was correlated with the expected reward value of different eye movements before a target appeared. Moreover, the activity of FEF neurons continued to signal the direction of eye movements, the expected reward value, and their interaction well after the movements were completed and when targets were no longer within the neuronal response field. This post-movement information was also observed in local field potentials, particularly in low-frequency bands. These results show that neural signals of prior actions and expected reward value persist across delays between those actions and their experienced outcomes. These memory traces may serve a role in reward-based learning in which subjects need to learn actions predicting delayed reward.