Operant matching is a generic outcome of synaptic plasticity based on the covariance between reward and neural activity.

Authors

Loewenstein Yonatan, Seung H Sebastian

Affiliation

Howard Hughes Medical Institute and the Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Publication information

Proc Natl Acad Sci U S A. 2006 Oct 10;103(41):15224-9. doi: 10.1073/pnas.0505220103. Epub 2006 Sep 28.

DOI:10.1073/pnas.0505220103

PMID:17008410

Full-text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC1622804/

Abstract

The probability of choosing an alternative in a long sequence of repeated choices is proportional to the total reward derived from that alternative, a phenomenon known as Herrnstein's matching law. This behavior is remarkably conserved across species and experimental conditions, but its underlying neural mechanisms still are unknown. Here, we propose a neural explanation of this empirical law of behavior. We hypothesize that there are forms of synaptic plasticity driven by the covariance between reward and neural activity and prove mathematically that matching is a generic outcome of such plasticity. Two hypothetical types of synaptic plasticity, embedded in decision-making neural network models, are shown to yield matching behavior in numerical simulations, in accord with our general theorem. We show how this class of models can be tested experimentally by making reward not only contingent on the choices of the subject but also directly contingent on fluctuations in neural activity. Maximization is shown to be a generic outcome of synaptic plasticity driven by the sum of the covariances between reward and all past neural activities.
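
The link between covariance and matching stated in the abstract can be illustrated with a small numerical check. This is only a sketch under simplifying assumptions, not the authors' network model: neural activity is reduced to a binary choice indicator A (1 for the first alternative, 0 for the second), and learning is assumed to settle where the covariance between reward R and A vanishes. For a binary A chosen with probability p, Cov(R, A) = p(1 - p)(E[R | A=1] - E[R | A=0]), so a zero covariance means both alternatives yield the same reward per choice, which is the matching condition. The function names and the toy session below are hypothetical.

```python
def reward_choice_covariance(choices, rewards):
    """Population covariance Cov(R, A) between reward R and binary choice A."""
    n = len(choices)
    mean_a = sum(choices) / n
    mean_r = sum(rewards) / n
    mean_ra = sum(r * a for r, a in zip(rewards, choices)) / n
    return mean_ra - mean_r * mean_a


def reward_per_choice(choices, rewards, alternative):
    """Average reward on the trials where `alternative` (0 or 1) was chosen."""
    picked = [r for r, a in zip(rewards, choices) if a == alternative]
    return sum(picked) / len(picked)


def choice_and_income_fractions(choices, rewards):
    """Fraction of choices of alternative 1 and fraction of total reward it earned."""
    choice_fraction = sum(choices) / len(choices)
    income_1 = sum(r for r, a in zip(rewards, choices) if a == 1)
    return choice_fraction, income_1 / sum(rewards)


# Made-up session: alternative 1 is chosen on 8 of 12 trials and earns
# 4 of the 6 rewards, so reward per choice is 0.5 for both alternatives.
choices = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0]
rewards = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0]

p = sum(choices) / len(choices)
gap = reward_per_choice(choices, rewards, 1) - reward_per_choice(choices, rewards, 0)

print("Cov(R, A):          ", reward_choice_covariance(choices, rewards))
print("p(1-p) * reward gap:", p * (1 - p) * gap)
print("choice vs income:   ", choice_and_income_fractions(choices, rewards))
```

In this toy session the covariance is zero up to rounding and the choice fraction equals the income fraction. Changing the rewards so that one alternative pays more per choice makes the covariance nonzero, which under the stated assumption would keep driving the choice probability.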

Similar articles

Statistical mechanics of reward-modulated learning in decision-making networks.

Neural Comput. 2012 May;24(5):1230-70. doi: 10.1162/NECO_a_00264. Epub 2012 Feb 1.

Robustness of learning that is based on covariance-driven synaptic plasticity.

PLoS Comput Biol. 2008 Mar 7;4(3):e1000007. doi: 10.1371/journal.pcbi.1000007.

Operant matching as a Nash equilibrium of an intertemporal game.

Neural Comput. 2009 Oct;21(10):2755-73. doi: 10.1162/neco.2009.09-08-854.

A spiking neural model for stable reinforcement of synapses based on multiple distal rewards.

Neural Comput. 2013 Jan;25(1):123-56. doi: 10.1162/NECO_a_00387. Epub 2012 Sep 28.

Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity.

Neural Comput. 2007 Jun;19(6):1468-502. doi: 10.1162/neco.2007.19.6.1468.

A biophysically based neural model of matching law behavior: melioration by stochastic synapses.

J Neurosci. 2006 Apr 5;26(14):3731-44. doi: 10.1523/JNEUROSCI.5159-05.2006.

Covariance-based synaptic plasticity in an attractor network model accounts for fast adaptation in free operant learning.

J Neurosci. 2013 Jan 23;33(4):1521-34. doi: 10.1523/JNEUROSCI.2068-12.2013.

Supervised learning through neuronal response modulation.

Neural Comput. 2005 Mar;17(3):609-31. doi: 10.1162/0899766053019980.

Dimensional reduction for reward-based learning.

Network. 2006 Sep;17(3):235-52. doi: 10.1080/09548980600773215.

Cited by

Model-based inference of synaptic plasticity rules.

Adv Neural Inf Process Syst. 2024;37:48519-48540.

Brain-inspired learning rules for spiking neural network-based control: a tutorial.

Biomed Eng Lett. 2024 Dec 2;15(1):37-55. doi: 10.1007/s13534-024-00436-6. eCollection 2025 Jan.

Brain mechanism of foraging: Reward-dependent synaptic plasticity versus neural integration of values.

Proc Natl Acad Sci U S A. 2024 Apr 2;121(14):e2318521121. doi: 10.1073/pnas.2318521121. Epub 2024 Mar 29.

Matching provides efficient decisions.

bioRxiv. 2024 Feb 19:2024.02.15.580481. doi: 10.1101/2024.02.15.580481.

Reward expectations direct learning and drive operant matching in .

Proc Natl Acad Sci U S A. 2023 Sep 26;120(39):e2221415120. doi: 10.1073/pnas.2221415120. Epub 2023 Sep 21.

Neural spiking for causal inference and learning.

PLoS Comput Biol. 2023 Apr 4;19(4):e1011005. doi: 10.1371/journal.pcbi.1011005. eCollection 2023 Apr.

Undermatching Is a Consequence of Policy Compression.

J Neurosci. 2023 Jan 18;43(3):447-457. doi: 10.1523/JNEUROSCI.1003-22.2022. Epub 2022 Dec 6.

Adaptive control of synaptic plasticity integrates micro- and macroscopic network function.

Neuropsychopharmacology. 2023 Jan;48(1):121-144. doi: 10.1038/s41386-022-01374-6. Epub 2022 Aug 29.

The dopamine circuit as a reward-taxis navigation system.

PLoS Comput Biol. 2022 Jul 25;18(7):e1010340. doi: 10.1371/journal.pcbi.1010340. eCollection 2022 Jul.

Computational mechanisms of distributed value representations and mixed learning strategies.

Nat Commun. 2021 Dec 10;12(1):7191. doi: 10.1038/s41467-021-27413-2.
