Natural behaviour is learned through dopamine-mediated reinforcement.

Authors

Kasdin Jonathan, Duffy Alison, Nadler Nathan, Raha Arnav, Fairhall Adrienne L, Stachenfeld Kimberly L, Gadagkar Vikram

Affiliations

Department of Neuroscience, Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA.

Department of Neurobiology and Biophysics and Computational Neuroscience Center, University of Washington, Seattle, WA, USA.

Publication information

Nature. 2025 May;641(8063):699-706. doi: 10.1038/s41586-025-08729-1. Epub 2025 Mar 12.

DOI:10.1038/s41586-025-08729-1

PMID:40074908

Abstract

Many natural motor skills, such as speaking or locomotion, are acquired through a process of trial-and-error learning over the course of development. It has long been hypothesized, motivated by observations in artificial learning experiments, that dopamine has a crucial role in this process. Dopamine in the basal ganglia is thought to guide reward-based trial-and-error learning by encoding reward prediction errors, decreasing after worse-than-predicted reward outcomes and increasing after better-than-predicted ones. Our previous work in adult zebra finches, in which we changed the perceived song quality with distorted auditory feedback, showed that dopamine in Area X, the singing-related basal ganglia, encodes performance prediction error: dopamine is suppressed after worse-than-predicted (distorted syllables) and activated after better-than-predicted (undistorted syllables) performance. However, it remains unknown whether the learning of natural behaviours, such as developmental vocal learning, occurs through dopamine-based reinforcement. Here we tracked song learning trajectories in juvenile zebra finches and used fibre photometry to monitor concurrent dopamine activity in Area X. We found that dopamine was activated after syllable renditions that were closer to the eventual adult version of the song, compared with recent renditions, and suppressed after renditions that were further away. Furthermore, the relationship between dopamine and song fluctuations revealed that dopamine predicted the future evolution of song, suggesting that dopamine drives behaviour. Finally, dopamine activity was explained by the contrast between the quality of the current rendition and the recent history of renditions, consistent with dopamine's hypothesized role in encoding prediction errors in an actor-critic reinforcement-learning model. Reinforcement-learning algorithms have emerged as a powerful class of model to explain learning in reward-based laboratory tasks, as well as for driving autonomous learning in artificial intelligence. Our results suggest that complex natural behaviours in biological systems can also be acquired through dopamine-mediated reinforcement learning.
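
As an illustration only, the contrast described in the abstract (the quality of the current rendition versus the recent history of renditions) can be sketched as a simple critic-style prediction error. This is not the authors' model: the function name `prediction_errors`, the exponential running average used as the "recent history", the smoothing rate `alpha`, and the example quality scores are all assumptions made for this sketch.

```python
from typing import List


def prediction_errors(quality: List[float], alpha: float = 0.2) -> List[float]:
    """Return one prediction error per rendition.

    Each error is the current rendition's quality minus a running
    estimate of recent quality (a stand-in for a critic's prediction).
    Positive values mean better than recent renditions, negative worse.
    """
    if not quality:
        return []
    # Assumption: the running estimate is seeded with the first rendition.
    value = quality[0]
    errors = []
    for q in quality:
        delta = q - value        # contrast: current quality vs. recent history
        errors.append(delta)
        value += alpha * delta   # move the estimate toward the new rendition
    return errors


# Hypothetical quality scores (e.g. similarity to the eventual adult song).
scores = [0.5, 0.5, 0.9, 0.5]
errors = prediction_errors(scores, alpha=0.5)
```

In this toy sequence the third rendition is better than the recent history and yields a positive error, while the fourth falls below the updated estimate and yields a negative one, mirroring the activation and suppression pattern the abstract reports.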
