Beyond simple reinforcement learning: the computational neurobiology of reward-learning and valuation.

Publication information

Eur J Neurosci. 2012 Apr;35(7):987-90. doi: 10.1111/j.1460-9568.2012.08074.x.

Abstract

Neural computational accounts of reward-learning have been dominated by the hypothesis that dopamine neurons behave like a reward-prediction error and thus facilitate reinforcement learning in striatal target neurons. While this framework is consistent with a lot of behavioral and neural evidence, this theory fails to account for a number of behavioral and neurobiological observations. In this special issue of EJN we feature a combination of theoretical and experimental papers highlighting some of the explanatory challenges faced by simple reinforcement-learning models and describing some of the ways in which the framework is being extended in order to address these challenges.

Similar articles

How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.

J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.

Deep and beautiful. The reward prediction error hypothesis of dopamine.

Stud Hist Philos Biol Biomed Sci. 2014 Mar;45:57-67. doi: 10.1016/j.shpsc.2013.10.006. Epub 2013 Nov 16.

Deficient reinforcement learning in medial frontal cortex as a model of dopamine-related motivational deficits in ADHD.

Neural Netw. 2013 Oct;46:199-209. doi: 10.1016/j.neunet.2013.05.008. Epub 2013 May 21.

Can the apparent adaptation of dopamine neurons' mismatch sensitivities be reconciled with their computation of reward prediction errors?

Neurosci Lett. 2008 Jun 13;438(1):14-6. doi: 10.1016/j.neulet.2008.04.059. Epub 2008 Apr 22.

The computational neurobiology of learning and reward.

Curr Opin Neurobiol. 2006 Apr;16(2):199-204. doi: 10.1016/j.conb.2006.03.006. Epub 2006 Mar 24.

The role of learning-related dopamine signals in addiction vulnerability.

Prog Brain Res. 2014;211:31-77. doi: 10.1016/B978-0-444-63425-2.00003-9.

[Reward processing of the basal ganglia--reward function of pedunculopontine tegmental nucleus].

Brain Nerve. 2009 Apr;61(4):397-404.

Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior.

Prog Brain Res. 2000;126:193-215. doi: 10.1016/S0079-6123(00)26015-9.

Neural control of dopamine neurotransmission: implications for reinforcement learning.

Eur J Neurosci. 2012 Apr;35(7):1115-23. doi: 10.1111/j.1460-9568.2012.08055.x.

Cited by

The Situated Assessment Method (SAM2): Establishing individual differences in habitual behavior.

PLoS One. 2023 Jun 22;18(6):e0286954. doi: 10.1371/journal.pone.0286954. eCollection 2023.

Association Between Aggression and Differential Functional Activity of Neural Regions Implicated in Retaliation.

J Am Acad Child Adolesc Psychiatry. 2023 Jul;62(7):805-815. doi: 10.1016/j.jaac.2023.01.021. Epub 2023 Mar 6.

Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T.

Hum Brain Mapp. 2022 Oct 15;43(15):4750-4790. doi: 10.1002/hbm.25988. Epub 2022 Jul 21.

Challenges and Opportunities for Grounding Cognition.

J Cogn. 2020 Sep 29;3(1):31. doi: 10.5334/joc.116.

Specific cortical and subcortical alterations for reactive and proactive aggression in children and adolescents with disruptive behavior.

Neuroimage Clin. 2020;27:102344. doi: 10.1016/j.nicl.2020.102344. Epub 2020 Jul 11.

Reminiscing about positive memories buffers acute stress responses.

Nat Hum Behav. 2017 May;1(5). doi: 10.1038/s41562-017-0093. Epub 2017 Apr 24.

Active Confirmation Bias in the Evaluative Processing of Food Images.

Sci Rep. 2018 Nov 15;8(1):16864. doi: 10.1038/s41598-018-35179-9.

Traits of empathy and anger: implications for psychopathy and other disorders associated with aggression.

Philos Trans R Soc Lond B Biol Sci. 2018 Apr 19;373(1744). doi: 10.1098/rstb.2017.0155.

Emotion-based learning systems and the development of morality.

Cognition. 2017 Oct;167:38-45. doi: 10.1016/j.cognition.2017.03.013. Epub 2017 Apr 7.

Dynamic Interaction between Reinforcement Learning and Attention in Multidimensional Environments.

Neuron. 2017 Jan 18;93(2):451-463. doi: 10.1016/j.neuron.2016.12.040.
