Institut de Neurosciences de la Timone, Aix Marseille Université, Unité Mixte de Recherche 7289 Centre National de la Recherche Scientifique, Marseille 13005, France
J Neurosci. 2023 May 3;43(18):3339-3352. doi: 10.1523/JNEUROSCI.0952-22.2023. Epub 2023 Apr 4.
Reward prediction error (RPE) signals are crucial for reinforcement learning and decision-making as they quantify the mismatch between predicted and obtained rewards. RPE signals are encoded in the neural activity of multiple brain areas, such as midbrain dopaminergic neurons, prefrontal cortex, and striatum. However, it remains unclear how these signals are expressed across anatomically and functionally distinct subregions of the striatum. In the current study, we examined to what extent RPE signals are represented across different striatal regions. To do so, we recorded local field potentials (LFPs) in sensorimotor, associative, and limbic striatal territories of two male rhesus monkeys performing a free-choice probabilistic learning task. The trial-by-trial evolution of RPE during task performance was estimated using a reinforcement learning model fitted to the monkeys' choice behavior. Overall, we found that changes in beta band oscillations (15-35 Hz) after the outcome of the animal's choice are consistent with RPE encoding. Moreover, we provide evidence that RPE-related signals are more strongly represented in the ventral (limbic) than in the dorsal (sensorimotor and associative) part of the striatum. To conclude, our results suggest a relationship between striatal beta oscillations and the evaluation of outcomes based on RPE signals and highlight a major contribution of the ventral striatum to the updating of learning processes.
Reward prediction error (RPE) signals are crucial for reinforcement learning and decision-making as they quantify the mismatch between predicted and obtained rewards. Current models suggest that RPE signals are encoded in the neural activity of multiple brain areas, including midbrain dopaminergic neurons, prefrontal cortex, and striatum. However, it remains elusive whether RPEs recruit anatomically and functionally distinct subregions of the striatum. Our study provides evidence that RPE-related modulations in local field potential (LFP) power are dominant in the striatum. In particular, they are stronger in the rostro-ventral than in the caudo-dorsal striatum. Our findings contribute to a better understanding of the role of striatal territories in reward-based learning and may be relevant for neuropsychiatric and neurologic diseases that affect striatal circuits.
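The abstract states that trial-by-trial RPEs were estimated with a reinforcement learning model fitted to choice behavior, but does not specify the model. As a minimal sketch of how such an estimate can be obtained, the Python below assumes a simple delta-rule (Rescorla-Wagner style) value update with a softmax choice rule. The function name `rpe_series`, the learning rate `alpha`, the inverse temperature `beta`, and the zero initial values are all illustrative assumptions, not the authors' implementation.

```python
import math


def rpe_series(choices, rewards, alpha, beta, n_options=2):
    """Return per-trial RPEs and the negative log-likelihood of the choices.

    Assumed model (hypothetical, not taken from the paper):
      - option values start at 0 and follow a delta-rule update
      - choices follow a softmax over values with inverse temperature beta
    """
    q = [0.0] * n_options          # current value estimate for each option
    rpes = []                      # reward prediction error on each trial
    nll = 0.0                      # negative log-likelihood of observed choices

    for choice, reward in zip(choices, rewards):
        # Softmax probability of the observed choice (max-shifted for stability).
        shift = max(beta * v for v in q)
        exps = [math.exp(beta * v - shift) for v in q]
        p_choice = exps[choice] / sum(exps)
        nll -= math.log(p_choice)

        # RPE = obtained reward minus predicted value of the chosen option.
        delta = reward - q[choice]
        rpes.append(delta)

        # Only the chosen option's value is updated.
        q[choice] += alpha * delta

    return rpes, nll


# Toy session: three trials, two options, binary rewards.
rpes, nll = rpe_series(choices=[0, 0, 1], rewards=[1, 0, 1], alpha=0.5, beta=3.0)
print(rpes)  # [1.0, -0.5, 1.0]
```

In a fitting procedure of this kind, `alpha` and `beta` would typically be chosen to minimize the returned negative log-likelihood over a session's choices, and the resulting RPE series could then be related to trial-wise LFP power. Whether the study used this exact model or parameterization is not stated in the abstract.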