Computational models of reinforcement learning: the role of dopamine as a reward signal.

Publication information

Cogn Neurodyn. 2010 Jun;4(2):91-105. doi: 10.1007/s11571-010-9109-x. Epub 2010 Mar 21.

Abstract

Reinforcement learning is ubiquitous. Unlike other forms of learning, it involves the processing of fast yet content-poor feedback information to correct assumptions about the nature of a task or of a set of stimuli. This feedback information is often delivered as generic rewards or punishments, and has little to do with the stimulus features to be learned. How can such low-content feedback lead to such an efficient learning paradigm? Through a review of existing neuro-computational models of reinforcement learning, we suggest that the efficiency of this type of learning resides in the dynamic and synergistic cooperation of brain systems that use different levels of computations. The implementation of reward signals at the synaptic, cellular, network and system levels gives the organism the robustness, adaptability and processing speed required for evolutionary and behavioral success.


Similar articles

3
Predictive reward signal of dopamine neurons.
J Neurophysiol. 1998 Jul;80(1):1-27. doi: 10.1152/jn.1998.80.1.1.

Cited by

2
Reinforcement learning processes as forecasters of depression remission.
J Affect Disord. 2025 Jan 1;368:829-837. doi: 10.1016/j.jad.2024.09.066. Epub 2024 Sep 11.

References cited in this article

1
Functional heterogeneity at dopamine release sites.
J Neurosci. 2009 Nov 18;29(46):14670-80. doi: 10.1523/JNEUROSCI.1349-09.2009.
