Institute of Neuroscience, Key Laboratory of Primate Neurobiology, Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China.
University of Chinese Academy of Sciences, Beijing 100049, China.
Proc Natl Acad Sci U S A. 2020 Dec 1;117(48):30728-30737. doi: 10.1073/pnas.2019077117. Epub 2020 Nov 16.
A key step of decision making is to determine the value associated with each option. The evaluation process often depends on the accumulation of evidence from multiple sources, which may arrive at different times. How evidence is accumulated for value computation in the brain during decision making has not been well studied. To address this problem, we trained rhesus monkeys to perform a decision-making task in which they had to make eye movement choices between two targets, whose reward probabilities had to be determined with the combined evidence from four sequentially presented visual stimuli. We studied the encoding of the reward probabilities associated with the stimuli and the eye movements in the orbitofrontal (OFC) and the dorsolateral prefrontal (DLPFC) cortices during the decision process. We found that the OFC neurons encoded the reward probability associated with individual pieces of evidence in the stimulus domain. Importantly, the representation of the reward probability in the OFC was transient, and the OFC did not encode the reward probability associated with the combined evidence from multiple stimuli. The computation of the combined reward probabilities was observed only in the DLPFC and only in the action domain. Furthermore, the reward probability encoding in the DLPFC exhibited an asymmetric pattern of mixed selectivity that supported the computation of the stimulus-to-action transition of reward information. Our results reveal that the OFC and the DLPFC play distinct roles in the value computation during evidence accumulation.