
Feasibility of dopamine as a vector-valued feedback signal in the basal ganglia.

Affiliations

Department of Neuroscience, Karolinska Institutet, 171 77 Stockholm, Sweden.

Division of Computational Science and Technology, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, 114 28 Stockholm, Sweden.

Publication information

Proc Natl Acad Sci U S A. 2023 Aug 8;120(32):e2221994120. doi: 10.1073/pnas.2221994120. Epub 2023 Aug 1.

DOI: 10.1073/pnas.2221994120
PMID: 37527344
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10410740/
Abstract

It is well established that midbrain dopaminergic neurons support reinforcement learning (RL) in the basal ganglia by transmitting a reward prediction error (RPE) to the striatum. In particular, different computational models and experiments have shown that a striatum-wide RPE signal can support RL over a small discrete set of actions (e.g., go/no-go, choose left/right). However, there is accumulating evidence that the basal ganglia functions not as a selector between predefined actions but rather as a dynamical system with graded, continuous outputs. To reconcile this view with RL, there is a need to explain how dopamine could support learning of continuous outputs, rather than discrete action values. Inspired by the recent observations that besides RPE, the firing rates of midbrain dopaminergic neurons correlate with motor and cognitive variables, we propose a model in which the dopamine signal in the striatum carries a vector-valued error feedback signal (a loss gradient) instead of a homogeneous scalar error (a loss). We implement a local, "three-factor" corticostriatal plasticity rule involving the presynaptic firing rate, a postsynaptic factor, and the unique dopamine concentration perceived by each striatal neuron. With this learning rule, we show that such a vector-valued feedback signal results in an increased capacity to learn a multidimensional series of real-valued outputs. Crucially, we demonstrate that this plasticity rule does not require precise nigrostriatal synapses but remains compatible with experimental observations of random placement of varicosities and diffuse volume transmission of dopamine.
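
The contrast the abstract draws between scalar and vector-valued feedback can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a single layer of tanh rate units standing in for striatal neurons, a hypothetical teacher task, and a "three-factor" update of the form presynaptic rate × postsynaptic factor × dopamine. The names `make_task`, `train`, and `mse`, as well as all sizes and learning rates, are illustrative assumptions. It does not model random varicosity placement or volume transmission.

```python
import numpy as np

def make_task(n_samples=200, n_in=20, n_out=5, seed=0):
    """Hypothetical task: real-valued targets generated by a random tanh layer."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_in))               # presynaptic (cortical) rates
    W_true = rng.normal(size=(n_out, n_in)) / np.sqrt(n_in)
    return X, np.tanh(X @ W_true.T)                      # one target per output unit

def mse(W, X, T):
    """Mean squared error of the tanh layer with weights W."""
    return float(np.mean((T - np.tanh(X @ W.T)) ** 2))

def train(X, T, feedback, lr=0.1, steps=500):
    """Three-factor rule: dW[j, i] ~ dopamine[j] * post[j] * pre[i].

    feedback="vector": each unit sees its own error component.
    feedback="scalar": every unit sees the same averaged error.
    """
    W = np.zeros((T.shape[1], X.shape[1]))
    for _ in range(steps):
        Y = np.tanh(X @ W.T)
        err = T - Y
        if feedback == "vector":
            dopamine = err                               # unit-specific signal
        elif feedback == "scalar":
            dopamine = np.broadcast_to(err.mean(axis=1, keepdims=True), err.shape)
        else:
            raise ValueError(f"unknown feedback mode: {feedback}")
        post = 1.0 - Y ** 2                              # postsynaptic factor (tanh slope)
        W += lr * (dopamine * post).T @ X / len(X)       # average over samples
    return W

if __name__ == "__main__":
    X, T = make_task()
    for mode in ("scalar", "vector"):
        print(mode, mse(train(X, T, mode), X, T))
```

In this toy setting the vector rule reduces to the ordinary delta rule, so each unit can fit its own target, while the scalar rule drives all units with the same signal and can at best track the average target across outputs.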
