Laboratory for Integrated Theoretical Neuroscience, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan.
Curr Opin Neurobiol. 2014 Apr;25:123-9. doi: 10.1016/j.conb.2014.01.001. Epub 2014 Jan 23.
A fundamental challenge for computational and cognitive neuroscience is to understand how reward-based learning and decision-making operate, and how accrued knowledge and internal models of the environment are incorporated into them. Remarkable progress has been made in the field, guided by the midbrain dopamine reward prediction error hypothesis and the underlying reinforcement learning framework, which does not involve internal models ('model-free'). Recent studies, however, have begun not only to address more complex decision-making processes that are integrated with model-free decision-making, but also to incorporate internal models of environmental reward structures and of the minds of other agents, through model-based reinforcement learning and generalized prediction errors. Even dopamine, a classic model-free signal, may act as a multiplexed signal carrying model-based information and contribute to representational learning of reward structure.
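The reward prediction error at the heart of the model-free framework can be illustrated with a minimal temporal-difference (TD) learning sketch. This is not the authors' model; it is a generic toy example in which the error term delta plays the role hypothesized for phasic dopamine, and the two-state chain, learning rate, and discount factor are all illustrative assumptions.

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One model-free TD(0) update on the value table V.

    The reward prediction error delta = r + gamma * V(s') - V(s)
    is the quantity hypothesized to be signaled by midbrain dopamine.
    """
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha * delta
    return delta

# Toy two-state chain (an assumption for illustration):
# state 0 -> state 1 -> terminal 'T', with reward 1 on the final step.
V = {0: 0.0, 1: 0.0, 'T': 0.0}
for _ in range(500):
    td_update(V, 0, 0.0, 1)    # no reward on the first transition
    td_update(V, 1, 1.0, 'T')  # reward delivered on the second

print(V[1])  # approaches 1.0
print(V[0])  # approaches gamma * V[1], i.e. about 0.9
```

Note that no transition model of the environment appears anywhere in the update, which is exactly what 'model-free' means; a model-based learner would instead plan over learned estimates of the reward and transition structure.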