Dynamical model of salience gated working memory, action selection and reinforcement based on basal ganglia and dopamine feedback.

Author information

Ponzi Adam

Affiliation

Laboratory for Dynamics of Emergent Intelligence, RIKEN Brain Science Institute, Wako, Saitama, Japan.

Publication information

Neural Netw. 2008 Mar-Apr;21(2-3):322-30. doi: 10.1016/j.neunet.2007.12.040. Epub 2007 Dec 31.

DOI:10.1016/j.neunet.2007.12.040

PMID:18280108

Abstract

A simple working memory model based on recurrent network activation is proposed and its application to selection and reinforcement of an action is demonstrated as a solution to the temporal credit assignment problem. Reactivation of recent salient cue states is generated and maintained as a type of salience-gated recurrently active working memory, while lower salience distractors are ignored. Cue reactivation during the action selection period allows the cue to select an action while its reactivation at the reward period allows the reinforcement of the action selected by the reactivated state, which is necessarily the action which led to the reward being found. A down-gating of the external input during the reactivation and maintenance prevents interference. A double winner-take-all system which selects only one cue and only one action allows the targeting of the cue-action allocation to be modified. This targeting works both to reinforce a correct cue-action allocation and to punish the allocation when cue-action allocations change. Here we suggest a firing rate neural network implementation of this system based on the basal ganglia anatomy with input from a cortical association layer where reactivations are generated by signals from the thalamus. Striatum medium spiny neurons represent actions. Auto-catalytic feedback from a dopamine reward signal modulates three-way Hebbian long-term potentiation and depression at the cortical-striatal synapses which represent the cue-action associations. The model is illustrated by the numerical simulations of a simple example, that of associating a cue signal to a correct action to obtain reward after a delay period, typical of primate cue-reward tasks. Through learning, the model shows a transition from an exploratory phase where actions are generated randomly, to a stable directed phase where the animal always chooses the correct action for each experienced state. When cue-action allocations change, we show that this is noticed by the model, the incorrect cue-action allocations are punished and the correct ones discovered.
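The learning loop described in the abstract can be illustrated with a minimal discrete-trial sketch. It is not the paper's firing-rate network: the function names, the uniform exploration noise, the learning rate, the 0.5 dopamine baseline and the weight clipping are all assumptions chosen for illustration. It keeps only three ideas from the abstract: a single cue is held until reward time, a winner-take-all step picks a single action, and a three-factor update (held cue, chosen action, dopamine signal) strengthens or weakens only the winning cue-action weight.

```python
import random

def winner_take_all(drive):
    """Index of the strongest unit; only this unit is treated as active."""
    return max(range(len(drive)), key=lambda i: drive[i])

def run_trials(w, correct_action, trials, rng, lr=0.2, noise=0.5, baseline=0.5):
    """Run delayed cue-action trials, updating w in place; return the rewards.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rewards = []
    for _ in range(trials):
        cue = rng.randrange(len(w))  # one salient cue wins and is held in memory
        # The held cue drives the action units; uniform noise gives exploration.
        drive = [w_ca + rng.uniform(0.0, noise) for w_ca in w[cue]]
        action = winner_take_all(drive)
        reward = 1.0 if action == correct_action[cue] else 0.0
        # Three-factor update: pre (held cue) x post (chosen action) x dopamine.
        # Pre and post are 1 for the winning pair and 0 elsewhere, so only
        # w[cue][action] changes. Dopamine above baseline potentiates the
        # weight, dopamine below baseline depresses it.
        dopamine = reward - baseline
        w[cue][action] = max(-1.0, min(1.0, w[cue][action] + lr * dopamine))
        rewards.append(reward)
    return rewards

rng = random.Random(0)
weights = [[0.0, 0.0], [0.0, 0.0]]             # weights[cue][action]
before = run_trials(weights, [0, 1], 200, rng)
after = run_trials(weights, [1, 0], 200, rng)  # cue-action allocation reversed
```

In this sketch, early trials are exploratory because the noise dominates the near-zero weights. Once a rewarded weight leads its rival by more than the noise range, the same action wins every time. After the reversal the previously correct weight is depressed on each unrewarded trial until exploration resumes and the new allocation is found.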

Similar articles

Functional properties of the basal ganglia's re-entrant loop architecture: selection and reinforcement.

Neuroscience. 2011 Dec 15;198:138-51. doi: 10.1016/j.neuroscience.2011.07.060. Epub 2011 Jul 29.

Banishing the homunculus: making working memory work.

Neuroscience. 2006 Apr 28;139(1):105-18. doi: 10.1016/j.neuroscience.2005.04.067. Epub 2005 Dec 15.

How laminar frontal cortex and basal ganglia circuits interact to control planned and reactive saccades.

Neural Netw. 2004 May;17(4):471-510. doi: 10.1016/j.neunet.2003.08.006.

Short-term memory traces for action bias in human reinforcement learning.

Brain Res. 2007 Jun 11;1153:111-21. doi: 10.1016/j.brainres.2007.03.057. Epub 2007 Mar 24.

Goal-directed learning of features and forward models.

Neural Netw. 2009 Jul-Aug;22(5-6):586-92. doi: 10.1016/j.neunet.2009.06.049. Epub 2009 Jul 8.

A dopaminergic basis for working memory, learning and attentional shifting in Parkinsonism.

Neuropsychologia. 2008 Nov;46(13):3144-56. doi: 10.1016/j.neuropsychologia.2008.07.011. Epub 2008 Jul 19.

Background-activity-dependent properties of a network model for working memory that incorporates cellular bistability.

Biol Cybern. 2005 Aug;93(2):109-18. doi: 10.1007/s00422-005-0543-5. Epub 2005 Apr 1.

Working memory and response selection: a computational account of interactions among cortico-basalganglio-thalamic loops.

Neural Netw. 2012 Feb;26:59-74. doi: 10.1016/j.neunet.2011.10.008. Epub 2011 Oct 25.

A neurocomputational model of dopamine and prefrontal-striatal interactions during multicue category learning by Parkinson patients.

J Cogn Neurosci. 2011 Jan;23(1):151-67. doi: 10.1162/jocn.2010.21420.

Cited by

Imagery in the entropic associative memory.

Sci Rep. 2023 Jun 12;13(1):9553. doi: 10.1038/s41598-023-36761-6.

From Focused Thought to Reveries: A Memory System for a Conscious Robot.

Front Robot AI. 2018 Apr 4;5:29. doi: 10.3389/frobt.2018.00029. eCollection 2018.

The modeling and simulation of visuospatial working memory.

Cogn Neurodyn. 2010 Dec;4(4):359-66. doi: 10.1007/s11571-010-9129-6. Epub 2010 Aug 25.

Basal ganglia neurons dynamically facilitate exploration during associative learning.

J Neurosci. 2011 Mar 30;31(13):4878-85. doi: 10.1523/JNEUROSCI.3658-10.2011.

Adaptation, expertise, and giftedness: towards an understanding of cortical, subcortical, and cerebellar network contributions.

Cerebellum. 2010 Dec;9(4):499-529. doi: 10.1007/s12311-010-0192-7.

Striatal activity during intentional switching depends on pattern stability.

J Neurosci. 2010 Mar 3;30(9):3167-74. doi: 10.1523/JNEUROSCI.2673-09.2010.
