Eur J Neurosci. 2012 Apr;35(7):987-90. doi: 10.1111/j.1460-9568.2012.08074.x.
Neural computational accounts of reward learning have been dominated by the hypothesis that the activity of dopamine neurons behaves like a reward-prediction error signal and thereby facilitates reinforcement learning in striatal target neurons. Although this framework is consistent with much behavioral and neural evidence, it fails to account for a number of behavioral and neurobiological observations. In this special issue of EJN, we feature a combination of theoretical and experimental papers that highlight some of the explanatory challenges faced by simple reinforcement-learning models and describe some of the ways in which the framework is being extended to address these challenges.