Dopaminergic action prediction errors serve as a value-free teaching signal.

Authors

Greenstreet Francesca, Vergara Hernando Martinez, Johansson Yvonne, Pati Sthitapranjya, Schwarz Laura, Lenzi Stephen C, Geerts Jesse P, Wisdom Matthew, Gubanova Alina, Rollik Lars B, Kaur Jasvin, Moskovitz Theodore, Cohen Joseph, Thompson Emmett, Margrie Troy W, Clopath Claudia, Stephenson-Jones Marcus

Affiliations

Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, UK.

Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain.

Publication information

Nature. 2025 May 14. doi: 10.1038/s41586-025-09008-9.

Abstract

Choice behaviour of animals is characterized by two main tendencies: taking actions that led to rewards and repeating past actions. Theory suggests that these strategies may be reinforced by different types of dopaminergic teaching signals: reward prediction error to reinforce value-based associations and movement-based action prediction errors to reinforce value-free repetitive associations. Here we use an auditory discrimination task in mice to show that movement-related dopamine activity in the tail of the striatum encodes the hypothesized action prediction error signal. Causal manipulations reveal that this prediction error serves as a value-free teaching signal that supports learning by reinforcing repeated associations. Computational modelling and experiments demonstrate that action prediction errors alone cannot support reward-guided learning, but when paired with the reward prediction error circuitry they serve to consolidate stable sound-action associations in a value-free manner. Together we show that there are two types of dopaminergic prediction errors that work in tandem to support learning, each reinforcing different types of association in different striatal areas.
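The distinction between the two teaching signals can be illustrated with a toy simulation. The sketch below is not the authors' model; it is a minimal tabular example under stated assumptions: a two-tone, two-action task, a reward prediction error of the form (reward - Q) that updates a value table, and an action prediction error of the form (action taken - H) that updates a habit table without ever looking at reward. All names and parameters (`simulate`, `habit_weight`, `epsilon`, the learning rate, the tone labels) are hypothetical choices made for illustration.

```python
import random

STIMULI = ("high_tone", "low_tone")
ACTIONS = ("left", "right")


def simulate(correct, use_rpe=True, n_trials=4000, alpha=0.1,
             habit_weight=0.5, epsilon=0.1, seed=0):
    """Toy two-choice task with a value table (q) and a habit table (h).

    correct: dict mapping each stimulus to its rewarded action.
    use_rpe: if False, only the value-free habit table is trained.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STIMULI for a in ACTIONS}  # value-based
    h = {(s, a): 0.0 for s in STIMULI for a in ACTIONS}  # value-free
    window = n_trials // 4  # accuracy is measured over the last quarter
    late_correct = 0

    for t in range(n_trials):
        s = rng.choice(STIMULI)
        explore = rng.random() < epsilon

        # Choice combines learned value and habit strength.
        scores = {a: q[(s, a)] + habit_weight * h[(s, a)] for a in ACTIONS}
        best = max(scores.values())
        tied = [a for a in ACTIONS if scores[a] == best]
        a = rng.choice(ACTIONS) if explore else rng.choice(tied)

        r = 1.0 if a == correct[s] else 0.0

        # Reward prediction error: depends on the outcome.
        if use_rpe:
            q[(s, a)] += alpha * (r - q[(s, a)])

        # Action prediction error: depends only on which action was taken.
        for b in ACTIONS:
            taken = 1.0 if b == a else 0.0
            h[(s, b)] += alpha * (taken - h[(s, b)])

        if t >= n_trials - window:
            late_correct += int(r)

    return {"q": q, "h": h, "late_accuracy": late_correct / window}


if __name__ == "__main__":
    mapping = {"high_tone": "left", "low_tone": "right"}
    print("RPE + APE:", simulate(mapping, use_rpe=True)["late_accuracy"])
    print("APE only: ", simulate(mapping, use_rpe=False)["late_accuracy"])
```

In this sketch the habit table evolves identically whichever action is rewarded when the value table is switched off, so the action prediction error alone simply repeats whatever was chosen first. Once the value table biases choices toward the rewarded action, the same habit update reinforces that choice, which is one way to read the abstract's claim that the two signals work in tandem.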
