Vicarious reinforcement learning signals when instructing others.

Author information

Apps Matthew A J, Lesage Elise, Ramnani Narender

Affiliations

Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford OX1 9DU, United Kingdom; Department of Experimental Psychology, University of Oxford, Oxford OX1 2JD, United Kingdom; Department of Psychology, Royal Holloway, University of London, Surrey TW20 0EX, United Kingdom; and Neuroimaging Research Branch, Intramural Research Program, National Institute on Drug Abuse, National Institutes of Health, Baltimore, Maryland 21224.

Publication information

J Neurosci. 2015 Feb 18;35(7):2904-13. doi: 10.1523/JNEUROSCI.3669-14.2015.

DOI:10.1523/JNEUROSCI.3669-14.2015

PMID:25698730

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4331622/

Abstract

Reinforcement learning (RL) theory posits that learning is driven by discrepancies between the predicted and actual outcomes of actions (prediction errors [PEs]). In social environments, learning is often guided by similar RL mechanisms. For example, teachers monitor the actions of students and provide feedback to them. This feedback evokes PEs in students that guide their learning. We report the first study that investigates the neural mechanisms that underpin RL signals in the brain of a teacher. Neurons in the anterior cingulate cortex (ACC) signal PEs when learning from the outcomes of one's own actions but also signal information when outcomes are received by others. Does a teacher's ACC signal PEs when monitoring a student's learning? Using fMRI, we studied brain activity in human subjects (teachers) as they taught a confederate (student) action-outcome associations by providing positive or negative feedback. We examined activity time-locked to the students' responses, when teachers infer student predictions and know actual outcomes. We fitted an RL-based computational model to the behavior of the student to characterize their learning, and examined whether a teacher's ACC signals when a student's predictions are wrong. In line with our hypothesis, activity in the teacher's ACC covaried with the PE values in the model. Additionally, activity in the teacher's insula and ventromedial prefrontal cortex covaried with the predicted value according to the student. Our findings highlight that the ACC signals PEs vicariously for others' erroneous predictions, when monitoring and instructing their learning. These results suggest that RL mechanisms, processed vicariously, may underpin and facilitate teaching behaviors.
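To make the two model quantities named in the abstract concrete (the predicted value and the PE), the following is a minimal sketch of a delta-rule (Rescorla-Wagner style) learner. The abstract does not specify the form of the model the authors fitted, so the update rule, the function name `delta_rule`, the learning rate `alpha`, and the initial value `v0` are all assumptions chosen for illustration, not the authors' implementation.

```python
def delta_rule(outcomes, alpha=0.3, v0=0.0):
    """Trial-by-trial predicted values and prediction errors.

    Hypothetical illustration: a simple delta rule, assumed here
    because the abstract does not state which RL model was used.
    outcomes: sequence of numeric outcomes (e.g. 1 = positive
              feedback, 0 = negative feedback).
    alpha:    learning rate in [0, 1] (assumed parameter).
    v0:       initial predicted value (assumed parameter).
    """
    values, errors = [], []
    v = v0
    for outcome in outcomes:
        pe = outcome - v          # prediction error: actual minus predicted
        values.append(v)          # prediction held before the outcome
        errors.append(pe)
        v = v + alpha * pe        # move the prediction toward the outcome
    return values, errors


if __name__ == "__main__":
    vals, pes = delta_rule([1, 1, 0], alpha=0.5)
    for trial, (v, pe) in enumerate(zip(vals, pes), start=1):
        print(f"trial {trial}: predicted value = {v}, PE = {pe}")
```

In a model-based fMRI analysis of the kind the abstract describes, per-trial series like `values` and `errors` would typically serve as parametric regressors; how the authors constructed theirs is not stated in the abstract.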


