Smith Andrew, Li Ming, Becker Sue, Kapur Shitij
Department of Psychology, Neuroscience and Behaviour, McMaster University, Hamilton, Ontario, Canada.
Network. 2006 Mar;17(1):61-84. doi: 10.1080/09548980500361624.
The notion of prediction error has established itself at the heart of formal models of animal learning and current hypotheses of dopamine function. Several interpretations of prediction error have been offered, including the model-free reinforcement learning method known as temporal difference learning (TD) and the influential Rescorla-Wagner (RW) learning rule. Here, we present a model-based adaptation of these ideas that provides a good account of empirical data on dopamine neuron firing patterns and on associative learning paradigms such as latent inhibition, Kamin blocking and overshadowing. Our departure from model-free reinforcement learning also offers: 1) a parsimonious distinction between tonic and phasic dopamine functions; 2) a potential generalization of the role of phasic dopamine from valence-dependent "reward" processing to valence-independent "salience" processing; 3) an explanation for the selectivity of certain dopamine manipulations on motivation for distal rewards; and 4) a plausible link between formal notions of prediction error and accounts of disturbances of thought in schizophrenia (in which dopamine dysfunction is strongly implicated). The model distinguishes itself from existing accounts by offering novel predictions about the firing of dopamine neurons in as-yet untested behavioral scenarios.
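To make the two prediction-error formulations named in the abstract concrete, the following minimal Python sketch contrasts the TD error with the Rescorla-Wagner update. It illustrates only the standard textbook rules, not the authors' model-based account; all names, parameter values (alpha, gamma) and the toy blocking schedule are our own assumptions.

# Illustrative sketch of the two prediction-error rules named in the
# abstract: temporal-difference (TD) learning and Rescorla-Wagner (RW).
# NOT the authors' model-based account; parameters and the toy trial
# structure below are assumptions chosen for illustration.

def td_error(reward, value_next, value_current, gamma=0.95):
    """Phasic TD prediction error: delta = r + gamma*V(s') - V(s)."""
    return reward + gamma * value_next - value_current

def rescorla_wagner_update(strengths, present_cues, reward, alpha=0.1):
    """RW rule: all present cues share one summed prediction error,
    delta = lambda - sum(V), which is what produces Kamin blocking."""
    summed_prediction = sum(strengths[c] for c in present_cues)
    delta = reward - summed_prediction          # shared prediction error
    for c in present_cues:
        strengths[c] += alpha * delta
    return delta

# Toy blocking schedule: pretrain cue A alone, then train compound AB.
V = {"A": 0.0, "B": 0.0}
for _ in range(100):                            # A alone predicts reward
    rescorla_wagner_update(V, ["A"], reward=1.0)
for _ in range(100):                            # compound AB, same reward
    rescorla_wagner_update(V, ["A", "B"], reward=1.0)
print(V)  # V["B"] stays near 0: A "blocks" learning about B

# Phasic-signal analogue: a fully predicted reward yields a TD error
# near zero, mirroring the reported pause in dopamine burst firing.
print(td_error(reward=1.0, value_next=0.0, value_current=V["A"]))

Running the toy schedule leaves V["B"] near zero: because cue A already predicts the reward, the shared prediction error is close to zero throughout compound training, which is the Kamin blocking effect the abstract refers to.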