Motivation, Brain and Behavior Laboratory, Neuroimaging Research Center, Brain and Spine Institute, INSERM U975, CNRS UMR 7225, UPMC-P6 UMR S 1127, 7561 Paris Cedex 13, France.
Motivation, Brain and Behavior Laboratory, Neuroimaging Research Center, Brain and Spine Institute, INSERM U975, CNRS UMR 7225, UPMC-P6 UMR S 1127, 7561 Paris Cedex 13, France, Laboratoire de Neurosciences Cognitives, INSERM U960, and Département d'Etudes Cognitives, Ecole Normale Supérieure, 7505, Paris, France.
J Neurosci. 2014 Nov 19;34(47):15621-30. doi: 10.1523/JNEUROSCI.1350-14.2014.
The mechanisms of reward maximization have been extensively studied at both the computational and neural levels. By contrast, little is known about how the brain learns to choose the options that minimize action cost. In principle, the brain could have evolved a general mechanism that applies the same learning rule to the different dimensions of choice options. To test this hypothesis, we scanned healthy human volunteers while they performed a probabilistic instrumental learning task that varied in both the physical effort and the monetary outcome associated with choice options. Behavioral data showed that the same computational rule, using prediction errors to update expectations, could account for both reward maximization and effort minimization. However, these learning-related variables were encoded in partially dissociable brain areas. In line with previous findings, the ventromedial prefrontal cortex was found to positively represent expected and actual rewards, regardless of effort. A separate network, encompassing the anterior insula, the dorsal anterior cingulate, and the posterior parietal cortex, correlated positively with expected and actual efforts. These findings suggest that the same computational rule is applied by distinct brain systems, depending on the choice dimension (cost or benefit) that has to be learned.
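The computational rule mentioned above, using prediction errors to update expectations, can be illustrated with a minimal sketch. The abstract does not give the fitted model, so everything specific here is an assumption made for illustration: the function names, the learning rate of 0.5, the binary coding of outcomes, the linear trade-off between expected reward and expected effort, and the softmax choice rule.

```python
import math


def delta_update(expectation, outcome, learning_rate):
    """Move an expectation toward the observed outcome by a fraction of the prediction error."""
    prediction_error = outcome - expectation
    return expectation + learning_rate * prediction_error


def choice_probability(reward_a, effort_a, reward_b, effort_b,
                       effort_weight=1.0, temperature=1.0):
    """Probability of choosing option A under a softmax over net values.

    Net value = expected reward minus weighted expected effort
    (an assumed trade-off, not taken from the abstract).
    """
    value_a = reward_a - effort_weight * effort_a
    value_b = reward_b - effort_weight * effort_b
    return 1.0 / (1.0 + math.exp(-(value_a - value_b) / temperature))


# Hypothetical outcome sequence for one option:
# reward 1 = large monetary outcome, effort 1 = high physical effort.
reward_outcomes = [1, 1, 0, 1]
effort_outcomes = [0, 0, 1, 0]

expected_reward = 0.5  # assumed initial expectations
expected_effort = 0.5
for reward, effort in zip(reward_outcomes, effort_outcomes):
    # The same rule is applied to both choice dimensions.
    expected_reward = delta_update(expected_reward, reward, learning_rate=0.5)
    expected_effort = delta_update(expected_effort, effort, learning_rate=0.5)

print(expected_reward, expected_effort)
```

In this sketch a single update function serves both the benefit and the cost dimension; only the outcome fed into it differs, which mirrors the behavioral claim that one rule accounts for both reward maximization and effort minimization.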