
Evolutionary instability of selfish learning in repeated games.

Authors

McAvoy Alex, Kates-Harbeck Julian, Chatterjee Krishnendu, Hilbe Christian

Affiliations

Department of Mathematics, University of Pennsylvania, Philadelphia, PA, USA.

Center for Mathematical Biology, University of Pennsylvania, Philadelphia, PA, USA.

Publication

PNAS Nexus. 2022 Jul 27;1(4):pgac141. doi: 10.1093/pnasnexus/pgac141. eCollection 2022 Sep.

DOI:10.1093/pnasnexus/pgac141
PMID:36714856
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9802390/
Abstract

Across many domains of interaction, both natural and artificial, individuals use past experience to shape future behaviors. The results of such learning processes depend on what individuals wish to maximize. A natural objective is one's own success. However, when two such "selfish" learners interact with each other, the outcome can be detrimental to both, especially when there are conflicts of interest. Here, we explore how a learner can align incentives with a selfish opponent. Moreover, we consider the dynamics that arise when learning rules themselves are subject to evolutionary pressure. By combining extensive simulations and analytical techniques, we demonstrate that selfish learning is unstable in most classical two-player repeated games. If evolution operates on the level of long-run payoffs, selection instead favors learning rules that incorporate social (other-regarding) preferences. To further corroborate these results, we analyze data from a repeated prisoner's dilemma experiment. We find that selfish learning is insufficient to explain human behavior when there is a trade-off between payoff maximization and fairness.
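The claim that two selfish learners can end up worse off together can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes a donation game with hypothetical benefit `B` and cost `C`, restricts each player to an unconditional cooperation probability (so the repeated game reduces to its one-shot version), and uses a simple hill-climbing rule that accepts a random change only when the learner's own payoff increases. All names and parameter values are illustrative assumptions.

```python
import random

# Hypothetical donation-game parameters (assumptions, not from the paper):
# cooperating costs the actor C and gives the co-player B, with B > C > 0.
B, C = 3.0, 1.0

def payoff(p_self, p_other):
    """Expected per-round payoff when both players cooperate unconditionally
    with probabilities p_self and p_other."""
    return B * p_other - C * p_self

def selfish_step(p_self, p_other, rng, step=0.05):
    """Propose a small random change to p_self; keep it only if the
    learner's own payoff strictly increases (selfish hill climbing)."""
    candidate = min(1.0, max(0.0, p_self + rng.uniform(-step, step)))
    if payoff(candidate, p_other) > payoff(p_self, p_other):
        return candidate
    return p_self

def simulate(rounds=2000, seed=0, start=0.9):
    """Let two selfish learners update in alternation from a cooperative start."""
    rng = random.Random(seed)
    p = q = start
    for _ in range(rounds):
        p = selfish_step(p, q, rng)
        q = selfish_step(q, p, rng)
    return p, q

p, q = simulate()
initial_payoff = payoff(0.9, 0.9)  # payoff each player had at the start
final_payoff = payoff(p, q)        # payoff after both have "optimized"
```

In this simplified setting each learner's payoff always rises when it cooperates less, so both cooperation probabilities drift toward zero and each player ends with a lower payoff than at the cooperative starting point. Richer strategy spaces, such as strategies that condition on the previous round, are needed to study the incentive alignment the abstract describes.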

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3bb/9802390/ea6e42040611/pgac141fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3bb/9802390/7809168b11de/pgac141fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3bb/9802390/29d25b3a2235/pgac141fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3bb/9802390/1500d6aea220/pgac141fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e3bb/9802390/62914cf6f95f/pgac141fig5.jpg
