Institut des Systèmes Intelligents et de Robotique, Sorbonne Université, CNRS, F-75005, Paris, France.
CNRS, Institut de Neurosciences Cognitives et Intégratives d'Aquitaine (INCIA, UMR 5287), Bordeaux, France.
Sci Rep. 2019 May 1;9(1):6770. doi: 10.1038/s41598-019-43245-z.
In a volatile environment where rewards are uncertain, successful performance requires a delicate balance between exploitation of the best option and exploration of alternative choices. It has been proposed on theoretical grounds that dopamine contributes to the control of this exploration-exploitation trade-off; specifically, the higher the level of tonic dopamine, the more exploitation is favored. We demonstrate here that there is a formal relationship between the rescaling of dopamine positive reward prediction errors and the exploration-exploitation trade-off in simple non-stationary multi-armed bandit tasks. We further show in rats performing such a task that systemically antagonizing dopamine receptors greatly increases the number of random choices without affecting learning capacities. Simulations and comparisons of several computational models (an extended Q-learning model, a directed exploration model, and a meta-learning model), each fitted to individual animals, confirm that, independently of the model, decreasing dopaminergic activity does not affect the learning rate but is equivalent to an increase in the random exploration rate. This study shows that dopamine could adapt the exploration-exploitation trade-off in decision-making when facing changing environmental contingencies.
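The formal relationship invoked above can be illustrated with a minimal sketch (this is an illustration of the general softmax argument, not the paper's code): in a softmax choice rule, multiplying the action values by a gain factor kappa, as a uniform rescaling of positive reward prediction errors would do to the learned values at convergence, yields exactly the same choice probabilities as multiplying the inverse temperature beta by kappa. A reduced dopaminergic gain (kappa < 1) therefore acts like a lower beta, i.e. more random exploration, without touching the learning rate.

```python
import numpy as np

def softmax(q, beta):
    """Softmax choice probabilities over action values q with inverse temperature beta."""
    z = beta * np.asarray(q, dtype=float)
    z -= z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

q = np.array([0.2, 0.8])      # hypothetical learned action values
kappa, beta = 0.4, 3.0        # kappa < 1 mimics a reduced dopaminergic gain

# Rescaling the values is identical to rescaling the inverse temperature:
p_scaled_values = softmax(kappa * q, beta)
p_scaled_beta = softmax(q, kappa * beta)
assert np.allclose(p_scaled_values, p_scaled_beta)

# The lower effective beta flattens the distribution: the best option is
# chosen less often, i.e. choices become more random.
assert p_scaled_beta[1] < softmax(q, beta)[1]
```

Under this reading, a dopamine antagonist that scales down positive prediction errors shifts behavior toward uniform random choice while leaving the value-update (learning-rate) machinery intact, which is what the model comparison in the study reports.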