
Suppr 超能文献



Covariance-based synaptic plasticity in an attractor network model accounts for fast adaptation in free operant learning.

Affiliation

Department of Neurobiology, Alexander Silberman Institute of Life Sciences, Interdisciplinary Center for Neural Computation, Edmond and Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem 91904, Israel.

Publication

J Neurosci. 2013 Jan 23;33(4):1521-34. doi: 10.1523/JNEUROSCI.2068-12.2013.

DOI: 10.1523/JNEUROSCI.2068-12.2013
PMID: 23345226
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6618748/
Abstract

In free operant experiments, subjects alternate at will between targets that yield rewards stochastically. Behavior in these experiments is typically characterized by (1) an exponential distribution of stay durations, (2) matching of the relative time spent at a target to its relative share of the total number of rewards, and (3) adaptation after a change in the reward rates that can be very fast. The neural mechanism underlying these regularities is largely unknown. Moreover, current decision-making neural network models typically aim at explaining behavior in discrete-time experiments in which a single decision is made once in every trial, making these models hard to extend to the more natural case of free operant decisions. Here we show that a model based on attractor dynamics, in which transitions are induced by noise and preference is formed via covariance-based synaptic plasticity, can account for the characteristics of behavior in free operant experiments. We compare a specific instance of such a model, in which two recurrently excited populations of neurons compete for higher activity, to the behavior of rats responding on two levers for rewarding brain stimulation on a concurrent variable interval reward schedule (Gallistel et al., 2001). We show that the model is consistent with the rats' behavior, and in particular, with the observed fast adaptation to matching behavior. Further, we show that the neural model can be reduced to a behavioral model, and we use this model to deduce a novel "conservation law," which is consistent with the behavior of the rats.
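The abstract's core argument — that a covariance-based plasticity rule has matching behavior as its fixed point — can be illustrated with a deliberately simplified sketch. This is not the paper's attractor network: the two competing populations are collapsed into a single scalar preference per target with a sigmoidal choice rule, and the schedule rates `lam`, learning rate `eta`, and baseline time constant are assumed illustrative values. At the rule's fixed point the covariance between reward and action vanishes, so income per visit equalizes across targets, which is exactly the matching relation.

```python
import math
import random

def simulate_matching(steps=200000, eta=0.05, seed=1):
    """Toy covariance-based learner on concurrent variable-interval schedules."""
    rng = random.Random(seed)
    lam = [0.05, 0.15]        # per-step baiting probability of each target (assumed values)
    baited = [False, False]   # a baited target holds its reward until the next visit
    w = [0.0, 0.0]            # scalar preferences standing in for the two populations
    r_bar = 0.0               # running-average reward, used as the covariance baseline
    choices = [0, 0]
    rewards = [0, 0]
    for _ in range(steps):
        p1 = 1.0 / (1.0 + math.exp(-(w[1] - w[0])))   # sigmoidal choice probability
        probs = [1.0 - p1, p1]
        a = 1 if rng.random() < p1 else 0
        for i in range(2):                             # schedules bait independently
            if not baited[i] and rng.random() < lam[i]:
                baited[i] = True
        r = 1.0 if baited[a] else 0.0                  # collect reward if target was baited
        baited[a] = False
        choices[a] += 1
        rewards[a] += int(r)
        # covariance-style update: dw_i ~ (reward - baseline) * (action_i - E[action_i])
        for i in range(2):
            act = 1.0 if i == a else 0.0
            w[i] += eta * (r - r_bar) * (act - probs[i])
        r_bar += 0.01 * (r - r_bar)                    # slow baseline tracking
    choice_frac = choices[1] / steps
    reward_frac = rewards[1] / max(1, rewards[0] + rewards[1])
    return choice_frac, reward_frac
```

Running the sketch, the fraction of choices allocated to the richer target should approximately equal its fraction of total rewards, despite the learner never computing reward rates explicitly; this is the sense in which matching emerges as a generic outcome of the covariance rule.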

Similar articles

1. Covariance-based synaptic plasticity in an attractor network model accounts for fast adaptation in free operant learning.
J Neurosci. 2013 Jan 23;33(4):1521-34. doi: 10.1523/JNEUROSCI.2068-12.2013.
2. Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity.
Proc Natl Acad Sci U S A. 2006 Oct 10;103(41):15224-9. doi: 10.1073/pnas.0505220103. Epub 2006 Sep 28.
3. Robustness of learning that is based on covariance-driven synaptic plasticity.
PLoS Comput Biol. 2008 Mar 7;4(3):e1000007. doi: 10.1371/journal.pcbi.1000007.
4. A biophysically based neural model of matching law behavior: melioration by stochastic synapses.
J Neurosci. 2006 Apr 5;26(14):3731-44. doi: 10.1523/JNEUROSCI.5159-05.2006.
5. Coexistence of reward and unsupervised learning during the operant conditioning of neural firing rates.
PLoS One. 2014 Jan 27;9(1):e87123. doi: 10.1371/journal.pone.0087123. eCollection 2014.
6. Shaping embodied neural networks for adaptive goal-directed behavior.
PLoS Comput Biol. 2008 Mar 28;4(3):e1000042. doi: 10.1371/journal.pcbi.1000042.
7. Specific Plasticity Loci and Their Synergism Mediate Operant Conditioning.
J Neurosci. 2022 Feb 16;42(7):1211-1223. doi: 10.1523/JNEUROSCI.1722-21.2021. Epub 2022 Jan 6.
8. A spiking neural model for stable reinforcement of synapses based on multiple distal rewards.
Neural Comput. 2013 Jan;25(1):123-56. doi: 10.1162/NECO_a_00387. Epub 2012 Sep 28.
9. Computational model of the distributed representation of operant reward memory: combinatoric engagement of intrinsic and synaptic plasticity mechanisms.
Learn Mem. 2020 May 15;27(6):236-249. doi: 10.1101/lm.051367.120. Print 2020 Jun.
10. Bidirectional Modulation of Intrinsic Excitability in Rat Prelimbic Cortex Neuronal Ensembles and Non-Ensembles after Operant Learning.
J Neurosci. 2017 Sep 6;37(36):8845-8856. doi: 10.1523/JNEUROSCI.3761-16.2017. Epub 2017 Aug 4.

Cited by

1. Fast adaptation to rule switching using neuronal surprise.
PLoS Comput Biol. 2024 Feb 20;20(2):e1011839. doi: 10.1371/journal.pcbi.1011839. eCollection 2024 Feb.
2. Deviation from the matching law reflects an optimal strategy involving learning over multiple timescales.
Nat Commun. 2019 Apr 1;10(1):1466. doi: 10.1038/s41467-019-09388-3.
3. Striatal action-value neurons reconsidered.
Elife. 2018 May 31;7:e34248. doi: 10.7554/eLife.34248.
4. Adaptive learning and decision-making under uncertainty by metaplastic synapses guided by a surprise detection system.
Elife. 2016 Aug 9;5:e18073. doi: 10.7554/eLife.18073.
5. Spatial generalization in operant learning: lessons from professional basketball.
PLoS Comput Biol. 2014 May 22;10(5):e1003623. doi: 10.1371/journal.pcbi.1003623. eCollection 2014 May.
6. Vague-to-crisp dynamics of percept formation modeled as operant (selectionist) process.
Cogn Neurodyn. 2014 Feb;8(1):71-80. doi: 10.1007/s11571-013-9262-0. Epub 2013 Aug 4.
7. Dynamical regimes in neural network models of matching behavior.
Neural Comput. 2013 Dec;25(12):3093-112. doi: 10.1162/NECO_a_00522. Epub 2013 Sep 18.
8. Stochasticity, bistability and the wisdom of crowds: a model for associative learning in genetic regulatory networks.
PLoS Comput Biol. 2013;9(8):e1003179. doi: 10.1371/journal.pcbi.1003179. Epub 2013 Aug 22.
9. A multistep general theory of transition to addiction.
Psychopharmacology (Berl). 2013 Oct;229(3):387-413. doi: 10.1007/s00213-013-3224-4. Epub 2013 Aug 21.

References

1. Dynamics of time matching: Arousal makes better seem worse.
Psychon Bull Rev. 1995 Jun;2(2):208-15. doi: 10.3758/BF03210960.
2. Neural prediction errors reveal a risk-sensitive reinforcement-learning process in the human brain.
J Neurosci. 2012 Jan 11;32(2):551-62. doi: 10.1523/JNEUROSCI.5498-10.2012.
3. Bayesian sampling in visual perception.
Proc Natl Acad Sci U S A. 2011 Jul 26;108(30):12491-6. doi: 10.1073/pnas.1101430108. Epub 2011 Jul 8.
4. Functional requirements for reward-modulated spike-timing-dependent plasticity.
J Neurosci. 2010 Oct 6;30(40):13326-37. doi: 10.1523/JNEUROSCI.6249-09.2010.
5. Alternation rate in perceptual bistability is maximal at and symmetric around equi-dominance.
J Vis. 2010 Sep 1;10(11):1. doi: 10.1167/10.11.1.
6. Synaptic theory of replicator-like melioration.
Front Comput Neurosci. 2010 Jun 17;4:17. doi: 10.3389/fncom.2010.00017. eCollection 2010.
7. A reward-modulated hebbian learning rule can explain experimentally observed network reorganization in a brain control task.
J Neurosci. 2010 Jun 23;30(25):8400-10. doi: 10.1523/JNEUROSCI.4284-09.2010.
8. Gain in sensitivity and loss in temporal contrast of STDP by dopaminergic modulation at hippocampal synapses.
Proc Natl Acad Sci U S A. 2009 Aug 4;106(31):13028-33. doi: 10.1073/pnas.0900546106. Epub 2009 Jul 20.
9. Reinforcement learning can account for associative and perceptual learning on a visual-decision task.
Nat Neurosci. 2009 May;12(5):655-63. doi: 10.1038/nn.2304. Epub 2009 Apr 19.
10. Synaptic plasticity in the basal ganglia.
Behav Brain Res. 2009 Apr 12;199(1):119-28. doi: 10.1016/j.bbr.2008.10.030. Epub 2008 Nov 6.