A feature-specific prediction error model explains dopaminergic heterogeneity.

Affiliations

Princeton Neuroscience Institute, Princeton, NJ, USA.

Department of Psychology, Princeton University, Princeton, NJ, USA.

Publication information

Nat Neurosci. 2024 Aug;27(8):1574-1586. doi: 10.1038/s41593-024-01689-1. Epub 2024 Jul 3.

DOI: 10.1038/s41593-024-01689-1

PMID: 38961229

Abstract

The hypothesis that midbrain dopamine (DA) neurons broadcast a reward prediction error (RPE) is among the great successes of computational neuroscience. However, recent results contradict a core aspect of this theory: specifically that the neurons convey a scalar, homogeneous signal. While the predominant family of extensions to the RPE model replicates the classic model in multiple parallel circuits, we argue that these models are ill suited to explain reports of heterogeneity in task variable encoding across DA neurons. Instead, we introduce a complementary 'feature-specific RPE' model, positing that individual ventral tegmental area DA neurons report RPEs for different aspects of an animal's moment-to-moment situation. Further, we show how our framework can be extended to explain patterns of heterogeneity in action responses reported among substantia nigra pars compacta DA neurons. This theory reconciles new observations of DA heterogeneity with classic ideas about RPE coding while also providing a new perspective of how the brain performs reinforcement learning in high-dimensional environments.
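
One way to picture the contrast between a scalar RPE and feature-specific RPEs is the minimal sketch below. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the model's equations, so the linear value function, the equal split of reward across channels, and all names (`feature_specific_rpes`, `scalar_rpe`, `w`, `phi_now`, `phi_next`) are hypothetical. Under these assumptions each channel carries a temporal-difference error for its own feature, and the channels sum to the classic scalar error.

```python
import math

def scalar_rpe(w, phi_now, phi_next, reward, gamma=0.9):
    """Classic TD error with a linear value estimate V(s) = sum_i w[i] * phi[i]."""
    v_now = sum(wi * p for wi, p in zip(w, phi_now))
    v_next = sum(wi * p for wi, p in zip(w, phi_next))
    return reward + gamma * v_next - v_now

def feature_specific_rpes(w, phi_now, phi_next, reward, gamma=0.9):
    """One TD error per feature channel (hypothetical decomposition).

    Assumption: the reward is split equally across channels, and each
    channel only sees its own term w[i] * phi[i] of the value estimate.
    """
    n = len(w)
    r_share = reward / n  # assumed equal apportionment of reward
    return [
        r_share + gamma * wi * p_next - wi * p_now
        for wi, p_now, p_next in zip(w, phi_now, phi_next)
    ]

# Illustrative numbers only, not data from the paper.
w = [0.5, -0.2, 1.0]
phi_now = [1.0, 0.0, 0.5]
phi_next = [0.0, 1.0, 1.0]

deltas = feature_specific_rpes(w, phi_now, phi_next, reward=1.0)
total = scalar_rpe(w, phi_now, phi_next, reward=1.0)

# The heterogeneous per-feature errors add up to the scalar RPE.
assert math.isclose(sum(deltas), total)
```

In this sketch the individual entries of `deltas` differ in sign and size because each depends on a different feature, yet their sum matches the scalar error, which is one way heterogeneous responses could coexist with classic RPE coding.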

Similar articles

Ventral Tegmental Dopamine Neurons Participate in Reward Identity Predictions.

Curr Biol. 2019 Jan 7;29(1):93-103.e3. doi: 10.1016/j.cub.2018.11.050. Epub 2018 Dec 20.

Striatal dopamine ramping may indicate flexible reinforcement learning with forgetting in the cortico-basal ganglia circuits.

Front Neural Circuits. 2014 Apr 9;8:36. doi: 10.3389/fncir.2014.00036. eCollection 2014.

A distributional code for value in dopamine-based reinforcement learning.

Nature. 2020 Jan;577(7792):671-675. doi: 10.1038/s41586-019-1924-6. Epub 2020 Jan 15.

Neuronal implementation of the temporal difference learning algorithm in the midbrain dopaminergic system.

Proc Natl Acad Sci U S A. 2023 Nov 7;120(45):e2309015120. doi: 10.1073/pnas.2309015120. Epub 2023 Oct 30.

Cue and Reward Evoked Dopamine Activity Is Necessary for Maintaining Learned Pavlovian Associations.

J Neurosci. 2021 Jun 9;41(23):5004-5014. doi: 10.1523/JNEUROSCI.2744-20.2021. Epub 2021 Apr 22.

From Prediction to Action: Dissociable Roles of Ventral Tegmental Area and Substantia Nigra Dopamine Neurons in Instrumental Reinforcement.

J Neurosci. 2023 May 24;43(21):3895-3908. doi: 10.1523/JNEUROSCI.0028-23.2023. Epub 2023 Apr 25.

Can the apparent adaptation of dopamine neurons' mismatch sensitivities be reconciled with their computation of reward prediction errors?

Neurosci Lett. 2008 Jun 13;438(1):14-6. doi: 10.1016/j.neulet.2008.04.059. Epub 2008 Apr 22.

Dopaminergic Neurons and Brain Reward Pathways: From Neurogenesis to Circuit Assembly.

Am J Pathol. 2016 Mar;186(3):478-88. doi: 10.1016/j.ajpath.2015.09.023. Epub 2015 Dec 24.

Occasion setters determine responses of putative DA neurons to discriminative stimuli.

Neurobiol Learn Mem. 2020 Sep;173:107270. doi: 10.1016/j.nlm.2020.107270. Epub 2020 Jun 19.

Cited by

Correctness is its own reward: bootstrapping error signals in self-guided reinforcement learning.

bioRxiv. 2025 Aug 19:2025.07.18.665446. doi: 10.1101/2025.07.18.665446.

Striatal Gradient in Value-Decay Explains Regional Differences in Dopamine Patterns and Reinforcement Learning Computations.

J Neurosci. 2025 Jul 18. doi: 10.1523/JNEUROSCI.0170-25.2025.

Trial-by-trial learning of successor representations in human behavior.

bioRxiv. 2025 Jun 16:2024.11.07.622528. doi: 10.1101/2024.11.07.622528.

Striatal dopamine signals errors in prediction across different informational domains.

Sci Adv. 2025 Jul 11;11(28):eadq9684. doi: 10.1126/sciadv.adq9684. Epub 2025 Jul 9.

A Multi-Region Brain Model to Elucidate the Role of Hippocampus in Spatially Embedded Decision-Making.

bioRxiv. 2025 May 29:2025.05.29.656671. doi: 10.1101/2025.05.29.656671.

The interoceptive origin of reinforcement learning.

Trends Cogn Sci. 2025 Sep;29(9):840-854. doi: 10.1016/j.tics.2025.05.008. Epub 2025 Jun 10.

A multidimensional distributional map of future reward in dopamine neurons.

Nature. 2025 Jun;642(8068):691-699. doi: 10.1038/s41586-025-09089-6. Epub 2025 Jun 4.

Multi-timescale reinforcement learning in the brain.

Nature. 2025 Jun 4. doi: 10.1038/s41586-025-08929-9.

A corticostriatal learning mechanism linking excess striatal dopamine and auditory hallucinations.

bioRxiv. 2025 Mar 18:2025.03.18.643990. doi: 10.1101/2025.03.18.643990.

A prospective code for value in the serotonin system.

Nature. 2025 May;641(8064):952-959. doi: 10.1038/s41586-025-08731-7. Epub 2025 Mar 26.

References

Distributional coding of associative learning in discrete populations of midbrain dopamine neurons.

Cell Rep. 2024 Apr 23;43(4):114080. doi: 10.1016/j.celrep.2024.114080. Epub 2024 Apr 4.

Dopamine transients follow a striatal gradient of reward time horizons.

Nat Neurosci. 2024 Apr;27(4):737-746. doi: 10.1038/s41593-023-01566-3. Epub 2024 Feb 6.

Distributed processing for value-based choice by prelimbic circuits targeting anterior-posterior dorsal striatal subregions in male mice.

Nat Commun. 2023 Apr 6;14(1):1920. doi: 10.1038/s41467-023-36795-4.

Striatal dopamine explains novelty-induced behavioral dynamics and individual variability in threat prediction.

Neuron. 2022 Nov 16;110(22):3789-3804.e9. doi: 10.1016/j.neuron.2022.08.022. Epub 2022 Sep 20.

Choice-selective sequences dominate in cortical relative to thalamic inputs to NAc to support reinforcement learning.

Cell Rep. 2022 May 17;39(7):110756. doi: 10.1016/j.celrep.2022.110756.

The role of state uncertainty in the dynamics of dopamine.

Curr Biol. 2022 Mar 14;32(5):1077-1087.e9. doi: 10.1016/j.cub.2022.01.025. Epub 2022 Feb 2.

Context-dependent representations of movement in Drosophila dopaminergic reinforcement pathways.

Nat Neurosci. 2021 Nov;24(11):1555-1566. doi: 10.1038/s41593-021-00929-y. Epub 2021 Oct 25.

Models of heterogeneous dopamine signaling in an insect learning and memory center.

PLoS Comput Biol. 2021 Aug 10;17(8):e1009205. doi: 10.1371/journal.pcbi.1009205. eCollection 2021 Aug.

Dopamine Axons in Dorsal Striatum Encode Contralateral Visual Stimuli and Choices.

J Neurosci. 2021 Aug 25;41(34):7197-7205. doi: 10.1523/JNEUROSCI.0490-21.2021. Epub 2021 Jul 12.

Wave-like dopamine dynamics as a mechanism for spatiotemporal credit assignment.

Cell. 2021 May 13;184(10):2733-2749.e16. doi: 10.1016/j.cell.2021.03.046. Epub 2021 Apr 15.
