Institute of Neuroscience & Psychology, University of Glasgow, Glasgow, United Kingdom.
Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom.
Hum Brain Mapp. 2018 Jul;39(7):2887-2906. doi: 10.1002/hbm.24047. Epub 2018 Mar 25.
Learning occurs when an outcome differs from expectations, generating a reward prediction error (RPE) signal. The RPE signal has been hypothesized to simultaneously embody the valence of an outcome (better or worse than expected) and its surprise (how far from expectations). Nonetheless, growing evidence suggests that separate representations of the two RPE components exist in the human brain. Meta-analyses provide an opportunity to test this hypothesis and directly probe the extent to which the valence and surprise of the error signal are encoded in separate or overlapping networks. We carried out several meta-analyses on a large set of fMRI studies investigating the neural basis of RPE, locked at decision outcome. We identified two valence learning systems by pooling studies searching for differential neural activity in response to categorical positive-versus-negative outcomes. The first valence network (negative > positive) involved areas regulating alertness and switching behaviours, such as the midcingulate cortex, the thalamus and the dorsolateral prefrontal cortex, whereas the second valence network (positive > negative) encompassed regions of the human reward circuitry, such as the ventral striatum and the ventromedial prefrontal cortex. We also found evidence of a largely distinct surprise-encoding network including the anterior cingulate cortex, anterior insula and dorsal striatum. Together with recent animal and electrophysiological evidence, this meta-analysis points to a sequential and distributed encoding of different components of the RPE signal, with potentially distinct functional roles.
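The distinction between the two RPE components can be illustrated with a minimal sketch. It assumes the common textbook formulation in which the RPE is the obtained outcome minus the expected outcome, valence is the sign of that difference, and surprise is its absolute magnitude. The function name `decompose_rpe` and its parameters are hypothetical and chosen for illustration only; this is not the analysis pipeline used in the meta-analyses.

```python
def decompose_rpe(outcome, expected):
    """Split a reward prediction error into valence and surprise.

    Assumption (not taken from the paper): RPE = outcome - expected,
    valence is its sign, and surprise is its unsigned magnitude.
    """
    rpe = outcome - expected
    # Valence: +1 if better than expected, -1 if worse, 0 if as expected.
    valence = (rpe > 0) - (rpe < 0)
    # Surprise: how far the outcome was from expectations, ignoring direction.
    surprise = abs(rpe)
    return rpe, valence, surprise


# Two outcomes with opposite valence but identical surprise.
better = decompose_rpe(outcome=1.0, expected=0.25)
worse = decompose_rpe(outcome=0.0, expected=0.75)
print(better)
print(worse)
```

In this sketch the two example outcomes differ in valence but share the same surprise value, which is the kind of dissociation that separate valence and surprise networks would be expected to track.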