Actor-critic reinforcement learning in the songbird.

Author information

Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, United States.

Publication information

Curr Opin Neurobiol. 2020 Dec;65:1-9. doi: 10.1016/j.conb.2020.08.005. Epub 2020 Sep 6.

Abstract

It feels rewarding to ace your opponent on match point. Here, we propose that common mechanisms underlie reward and performance learning. First, when a singing bird unexpectedly hits the right note, its dopamine (DA) neurons are activated, much as they are when a thirsty monkey receives an unexpected juice reward. Second, these DA signals reinforce vocal variations much as they reinforce stimulus-response associations. Third, limbic inputs to DA neurons signal the predicted quality of song syllables much as they signal the predicted reward value of a place or a stimulus during foraging. Finally, songbirds may solve difficult problems in reinforcement learning, such as credit assignment and catastrophic forgetting, with node perturbation and consolidation of reinforced vocal patterns in motor cortical circuits. Consolidation occurs downstream of a canonical 'actor-critic' circuit motif that learns to maximize performance quality in essentially the same way it learns to maximize reward: by computing and learning from prediction errors.
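The actor-critic idea with node perturbation can be illustrated with a toy sketch. This is not the authors' model: the single "syllable", the scalar pitch, the quadratic quality measure, and every name and parameter value below are assumptions chosen for illustration. The critic keeps a running prediction of performance quality, the actor adds random variation to its output, and the prediction error (actual minus predicted quality) both updates the critic and reinforces whichever variation preceded a better-than-predicted outcome.

```python
import random


def train_actor_critic(target=3.0, trials=3000, sigma=0.5,
                       actor_lr=0.05, critic_lr=0.1, seed=0):
    """Toy actor-critic with node perturbation (illustrative assumptions only).

    target    -- hypothetical "right note" (arbitrary pitch units)
    sigma     -- standard deviation of the exploratory vocal variation
    actor_lr  -- learning rate for the actor's mean pitch
    critic_lr -- learning rate for the critic's quality prediction
    """
    rng = random.Random(seed)
    mean_pitch = 0.0         # actor: the pitch the bird currently aims for
    predicted_quality = 0.0  # critic: expected quality of this syllable

    for _ in range(trials):
        # Actor output = learned mean + random perturbation (exploration).
        noise = rng.gauss(0.0, sigma)
        pitch = mean_pitch + noise

        # Assumed performance measure: closer to the target is better.
        quality = -(pitch - target) ** 2

        # Prediction error: how much better or worse than the critic expected.
        error = quality - predicted_quality

        # Critic learns to predict quality.
        predicted_quality += critic_lr * error

        # Node perturbation: reinforce the variation in proportion to the error.
        mean_pitch += actor_lr * error * noise

    return mean_pitch, predicted_quality


if __name__ == "__main__":
    pitch, prediction = train_actor_critic()
    print(f"learned mean pitch: {pitch:.2f}, predicted quality: {prediction:.2f}")
```

In this sketch the same scalar error signal trains both components, which is the sense in which a single DA-like prediction error could drive both evaluation and motor learning. Because the update correlates the error with the actor's own perturbation, no explicit gradient of the quality function is required.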

Similar articles

3
Performance-Dependent Consolidation of Learned Vocal Changes in Adult Songbirds.
J Neurosci. 2022 Mar 9;42(10):1974-1986. doi: 10.1523/JNEUROSCI.1942-21.2021. Epub 2022 Jan 20.
9
A hypothesis for basal ganglia-dependent reinforcement learning in the songbird.
Neuroscience. 2011 Dec 15;198:152-70. doi: 10.1016/j.neuroscience.2011.09.069. Epub 2011 Oct 13.

Cited by

4
Comparative approaches to the neurobiology of avian vocal learning.
Curr Opin Neurobiol. 2025 Jun;92:102993. doi: 10.1016/j.conb.2025.102993. Epub 2025 Mar 4.
5
Social context affects sequence modification learning in birdsong.
Front Psychol. 2025 Feb 5;16:1488762. doi: 10.3389/fpsyg.2025.1488762. eCollection 2025.
9
Effects of stochastic coding on olfactory discrimination in flies and mice.
PLoS Biol. 2023 Oct 31;21(10):e3002206. doi: 10.1371/journal.pbio.3002206. eCollection 2023 Oct.
10
Feasibility of dopamine as a vector-valued feedback signal in the basal ganglia.
Proc Natl Acad Sci U S A. 2023 Aug 8;120(32):e2221994120. doi: 10.1073/pnas.2221994120. Epub 2023 Aug 1.
