A biophysically based neural model of matching law behavior: melioration by stochastic synapses.

Author information

Soltani Alireza, Wang Xiao-Jing

Affiliation

Volen Center for Complex Systems, Department of Physics, Brandeis University, Waltham, Massachusetts 02454, USA.

Publication information

J Neurosci. 2006 Apr 5;26(14):3731-44. doi: 10.1523/JNEUROSCI.5159-05.2006.

PMID:16597727

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6674121/

Abstract

In experiments designed to uncover the neural basis of adaptive decision making in a foraging environment, neuroscientists have reported single-cell activities in the lateral intraparietal cortex (LIP) that are correlated with choice options and their subjective values. To investigate the underlying synaptic mechanism, we considered a spiking neuron model of decision making endowed with synaptic plasticity that follows a reward-dependent stochastic Hebbian learning rule. This general model is tested in a matching task in which rewards on two targets are scheduled randomly with different rates. Our main results are threefold. First, we show that plastic synapses provide a natural way to integrate past rewards and estimate the local (in time) "return" of a choice. Second, our model reproduces the matching behavior (i.e., the proportional allocation of choices matches the relative reinforcement obtained on those choices, which is achieved through melioration in individual trials). Our model also explains the observed "undermatching" phenomenon and points to biophysical constraints (such as finite learning rate and stochastic neuronal firing) that set the limits to matching behavior. Third, although our decision model is an attractor network exhibiting winner-take-all competition, it captures graded neural spiking activities observed in LIP, when the latter were sorted according to the choices and the difference in the returns for the two targets. These results suggest that neurons in LIP are involved in selecting the oculomotor responses, whereas rewards are integrated and stored elsewhere, possibly by plastic synapses and in the form of the return rather than income of choice options.
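
To make the mechanism in the abstract concrete, the following is a minimal rate-level sketch, not the authors' spiking network. It assumes that each target's synaptic strength can be summarized by the fraction of potentiated synapses, that the attractor network's choice can be approximated by a sigmoid of the strength difference, and that rewards are baited on each target with a fixed per-trial probability and persist until collected. The function name `simulate_matching` and all parameter names and values (`rate_a`, `rate_b`, `q_plus`, `q_minus`, `sigma`) are hypothetical choices for illustration.

```python
import math
import random


def simulate_matching(rate_a=0.3, rate_b=0.1, n_trials=20000,
                      q_plus=0.1, q_minus=0.1, sigma=0.1, seed=0):
    """Toy two-target matching task with reward-dependent plastic synapses.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Fraction of potentiated synapses onto each choice-selective population.
    strength = {"A": 0.5, "B": 0.5}
    rates = {"A": rate_a, "B": rate_b}
    baited = {"A": False, "B": False}
    choices = {"A": 0, "B": 0}
    rewards = {"A": 0, "B": 0}

    for _ in range(n_trials):
        # Rewards are scheduled randomly and wait on a target until collected.
        for target in ("A", "B"):
            if not baited[target] and rng.random() < rates[target]:
                baited[target] = True

        # Stochastic choice: sigmoid of the difference in synaptic strengths
        # (a stand-in for the winner-take-all competition in the network).
        diff = strength["A"] - strength["B"]
        p_a = 1.0 / (1.0 + math.exp(-diff / sigma))
        choice = "A" if rng.random() < p_a else "B"
        choices[choice] += 1

        # Reward-dependent update, applied only to the chosen target's synapses:
        # potentiate after a reward, depress after no reward.
        if baited[choice]:
            baited[choice] = False
            rewards[choice] += 1
            strength[choice] += q_plus * (1.0 - strength[choice])
        else:
            strength[choice] -= q_minus * strength[choice]

    total_rewards = rewards["A"] + rewards["B"]
    choice_frac_a = choices["A"] / n_trials
    reward_frac_a = rewards["A"] / total_rewards if total_rewards else float("nan")
    return choice_frac_a, reward_frac_a, strength


if __name__ == "__main__":
    choice_frac_a, reward_frac_a, strength = simulate_matching()
    print("fraction of choices on A:", round(choice_frac_a, 3))
    print("fraction of rewards from A:", round(reward_frac_a, 3))
    print("final synaptic strengths:", strength)
```

Under these assumptions, each strength tracks a running estimate of the reward probability per choice of that target (the return), because it is updated only on trials in which the target is chosen. Comparing the two printed fractions gives a rough check of how closely choice allocation follows relative reinforcement. Changing `sigma` or the learning rates is one way to explore how noise and a finite learning rate limit matching.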
