People teach with rewards and punishments as communication, not reinforcements.

Affiliations

Department of Cognitive, Linguistics, and Psychological Sciences, Brown University.

Department of Psychology, Harvard University.

Publication information

J Exp Psychol Gen. 2019 Mar;148(3):520-549. doi: 10.1037/xge0000569.

DOI: 10.1037/xge0000569
PMID: 30802127
Abstract

Carrots and sticks motivate behavior, and people can teach new behaviors to other organisms, such as children or nonhuman animals, by tapping into their reward learning mechanisms. But how people teach with reward and punishment depends on their expectations about the learner. We examine how people teach using reward and punishment by contrasting two hypotheses. The first is evaluative feedback as reinforcement, where rewards and punishments are used to shape learner behavior through reinforcement learning mechanisms. The second is evaluative feedback as communication, where rewards and punishments are used to signal target behavior to a learning agent reasoning about a teacher's pedagogical goals. We present formalizations of learning from these 2 teaching strategies based on computational frameworks for reinforcement learning. Our analysis based on these models motivates a simple interactive teaching paradigm that distinguishes between the two teaching hypotheses. Across 3 sets of experiments, we find that people are strongly biased to use evaluative feedback communicatively rather than as reinforcement. (PsycINFO Database Record (c) 2019 APA, all rights reserved).
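
The contrast between the two hypotheses can be made concrete with a small toy example. The sketch below is only an illustration under stated assumptions: the three-state corridor, the feedback values (+1 for a step toward the goal, -0.5 for a step away), and both learner models are hypothetical and are not taken from the paper. It shows how the same feedback can lead a reward-maximizing learner and a signal-reading learner to different behavior.

```python
# Toy illustration only: the corridor, the feedback values, and both learner
# models are assumptions made for this sketch, not the paper's formalization.

FORWARD, BACK = "forward", "back"
GOAL = 2  # states are 0, 1, 2; reaching state 2 ends the episode


def teacher_feedback(action):
    """Hypothetical teacher: praise steps toward the goal, mildly scold steps away."""
    return 1.0 if action == FORWARD else -0.5


def step(state, action):
    """Move one cell along the corridor; stepping back from state 0 stays at 0."""
    return state + 1 if action == FORWARD else max(state - 1, 0)


def reinforcement_policy(gamma=0.95, sweeps=500):
    """Feedback as reinforcement: maximize discounted feedback via value iteration."""
    values = {0: 0.0, 1: 0.0, GOAL: 0.0}  # the goal is terminal, so its value stays 0

    def q(state, action):
        return teacher_feedback(action) + gamma * values[step(state, action)]

    for _ in range(sweeps):
        for state in (0, 1):
            values[state] = max(q(state, a) for a in (FORWARD, BACK))
    return {s: max((FORWARD, BACK), key=lambda a: q(s, a)) for s in (0, 1)}


def communication_policy():
    """Feedback as communication: the most praised action is read as the intended one."""
    return {s: max((FORWARD, BACK), key=teacher_feedback) for s in (0, 1)}


if __name__ == "__main__":
    print("reinforcement:", reinforcement_policy())
    print("communication:", communication_policy())
```

Under these assumed values, the reward-maximizing learner steps back from state 1 and keeps cycling, because repeated praise for stepping forward outweighs the mild scolding for stepping back. The signal-reading learner simply walks to the goal. Different feedback values would change the first learner's behavior, so the result is specific to this toy setup.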

Similar articles

1
People teach with rewards and punishments as communication, not reinforcements.
J Exp Psychol Gen. 2019 Mar;148(3):520-549. doi: 10.1037/xge0000569.
2
Social is special: A normative framework for teaching with and learning from evaluative feedback.
Cognition. 2017 Oct;167:91-106. doi: 10.1016/j.cognition.2017.03.006. Epub 2017 Mar 22.
3
Modular deep reinforcement learning from reward and punishment for robot navigation.
Neural Netw. 2021 Mar;135:115-126. doi: 10.1016/j.neunet.2020.12.001. Epub 2020 Dec 8.
4
Reward and punishment learning deficits among bipolar disorder subtypes.
J Affect Disord. 2023 Nov 1;340:694-702. doi: 10.1016/j.jad.2023.08.075. Epub 2023 Aug 15.
5
Punishment is Organized around Principles of Communicative Inference.
Cognition. 2021 Mar;208:104544. doi: 10.1016/j.cognition.2020.104544. Epub 2020 Dec 28.
6
A Neurocomputational Account of How Inflammation Enhances Sensitivity to Punishments Versus Rewards.
Biol Psychiatry. 2016 Jul 1;80(1):73-81. doi: 10.1016/j.biopsych.2015.07.018. Epub 2015 Aug 1.
7
Winners and losers: Reward and punishment produce biases in temporal selection.
J Exp Psychol Learn Mem Cogn. 2019 May;45(5):822-833. doi: 10.1037/xlm0000612. Epub 2018 Jul 9.
8
Decision-making patterns and sensitivity to reward and punishment in children with attention-deficit hyperactivity disorder.
Int J Psychophysiol. 2009 Jun;72(3):283-8. doi: 10.1016/j.ijpsycho.2009.01.007.
9
Dual-task performance is differentially modulated by rewards and punishments.
Behav Brain Res. 2013 Aug 1;250:304-7. doi: 10.1016/j.bbr.2013.05.010. Epub 2013 May 13.
10
Confirmatory reinforcement learning changes with age during adolescence.
Dev Sci. 2023 May;26(3):e13330. doi: 10.1111/desc.13330. Epub 2022 Oct 27.

Cited by

1
What people learn from punishment: A cognitive model.
Proc Natl Acad Sci U S A. 2025 Aug 12;122(32):e2500730122. doi: 10.1073/pnas.2500730122. Epub 2025 Aug 4.
2
The influence of social feedback on reward learning in the Iowa gambling task.
Front Psychol. 2024 May 2;15:1292808. doi: 10.3389/fpsyg.2024.1292808. eCollection 2024.
3
Punishment: one tool, many uses.
Evol Hum Sci. 2019 Nov 12;1:e12. doi: 10.1017/ehs.2019.12. eCollection 2019.
4
The cultural evolution of teaching.
Evol Hum Sci. 2023 May 12;5:e14. doi: 10.1017/ehs.2023.14. eCollection 2023.
5
Learning from other minds: An optimistic critique of reinforcement learning models of social learning.
Curr Opin Behav Sci. 2021 Apr;38:110-115. doi: 10.1016/j.cobeha.2021.01.006. Epub 2021 Mar 23.
6
Human Representation Learning.
Annu Rev Neurosci. 2021 Jul 8;44:253-273. doi: 10.1146/annurev-neuro-092920-120559. Epub 2021 Mar 17.