
Evolution of cooperation guided by the coexistence of imitation learning and reinforcement learning.

Authors

Tang Wei, Wang Guoling, Xing Zhiyan

Affiliations

School of Information Engineering, Guizhou Open University, Guiyang, Guizhou, 550023, China.

School of Mathematics and Statistics, Guizhou University, Guiyang, Guizhou, 550025, China.

Publication information

Sci Rep. 2025 Jul 18;15(1):26136. doi: 10.1038/s41598-025-11557-y.

DOI:10.1038/s41598-025-11557-y
PMID:40681680
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12274395/
Abstract

Promoting cooperation remains a major challenge in natural science. While most studies focus on single strategy update rules, individuals in real life often use multiple strategies in response to dynamic environments. This paper introduces a mixed update rule combining imitation learning (IL) and reinforcement learning (RL). In IL, individuals adopt strategies from higher-payoff opponents, while RL relies on personal experience. Simulations of the Prisoner's Dilemma Game (PDG), Coexistence Game (CG), and Coordination Game (CoG), in both well-mixed populations and square lattice networks, show that: (i) cooperation and defection coexist in the PDG, resolving the dilemma of universal defection; (ii) cooperation exceeds the mixed Nash equilibrium in the CG; and (iii) cooperators dominate in the CoG. The mixed update rule outperforms single-strategy approaches in these games, highlighting its effectiveness in fostering cooperation.
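
To make the idea of a mixed update rule concrete, the following is a minimal sketch for a well-mixed population. It is not the authors' implementation: the abstract does not state the update probabilities, the imitation function, or the RL variant, so everything here is an assumption chosen for illustration. In particular, the payoff notation `R`, `S`, `T`, `P`, the Fermi imitation probability with noise `K`, the stateless Q-learning update with `alpha`, `gamma`, and `epsilon`, and the mixing probability `p_imitate` are all hypothetical. The helper `mixed_nash` computes the interior equilibrium fraction of cooperators of a symmetric 2x2 game, which is the baseline that result (ii) refers to.

```python
import math
import random


def mixed_nash(R, S, T, P):
    """Interior equilibrium share of cooperators in a symmetric 2x2 game.

    Solves x*R + (1-x)*S == x*T + (1-x)*P for x. It lies in (0, 1) when
    T > R and S > P, the coexistence (snowdrift-type) ordering.
    """
    return (S - P) / ((S - P) + (T - R))


def payoff(me, other, R, S, T, P):
    """Payoff to `me` against `other`; 1 means cooperate, 0 means defect."""
    if me == 1:
        return R if other == 1 else S
    return T if other == 1 else P


def pick_other(rng, n, i):
    """Uniformly pick an index in range(n) different from i."""
    k = rng.randrange(n - 1)
    return k if k < i else k + 1


def simulate(R, S, T, P, n=200, steps=20000, p_imitate=0.5,
             alpha=0.1, gamma=0.9, epsilon=0.02, K=0.1, seed=0):
    """Return the final fraction of cooperators under the assumed mixed rule."""
    rng = random.Random(seed)
    strategy = [rng.randint(0, 1) for _ in range(n)]
    q = [[0.0, 0.0] for _ in range(n)]  # assumed stateless Q-values per agent

    for _ in range(steps):
        i = rng.randrange(n)          # focal agent
        j = pick_other(rng, n, i)     # role model for imitation
        # Each of the two agents plays one game against a random opponent.
        pay_i = payoff(strategy[i], strategy[pick_other(rng, n, i)], R, S, T, P)
        pay_j = payoff(strategy[j], strategy[pick_other(rng, n, j)], R, S, T, P)

        # RL part: the focal agent always learns from its own payoff.
        a = strategy[i]
        q[i][a] += alpha * (pay_i + gamma * max(q[i]) - q[i][a])

        if rng.random() < p_imitate:
            # Imitation: copy j with the (assumed) Fermi probability,
            # which grows as j's payoff exceeds i's.
            if rng.random() < 1.0 / (1.0 + math.exp((pay_i - pay_j) / K)):
                strategy[i] = strategy[j]
        elif rng.random() < epsilon:
            strategy[i] = rng.randint(0, 1)  # exploration
        else:
            # Greedy action; ties are broken in favour of cooperation,
            # which is an arbitrary choice of this sketch.
            strategy[i] = 0 if q[i][0] > q[i][1] else 1

    return sum(strategy) / n


if __name__ == "__main__":
    # Hypothetical coexistence-game payoffs (T > R > S > P).
    R, S, T, P = 1.0, 0.5, 1.5, 0.0
    print("mixed Nash equilibrium:", mixed_nash(R, S, T, P))
    print("final cooperator fraction:", simulate(R, S, T, P))
```

Setting `p_imitate` to 1.0 or 0.0 reduces the sketch to pure imitation or pure reinforcement learning, which gives a rough way to compare the mixed rule against the single-rule baselines the abstract mentions. The numbers it produces depend entirely on the assumed parameters and should not be read as a reproduction of the paper's results.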


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/43ea028ef394/41598_2025_11557_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/047a8495d5f0/41598_2025_11557_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/63a91966af6b/41598_2025_11557_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/c58e0f9f51c4/41598_2025_11557_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/10bde823a93e/41598_2025_11557_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/45ac047a5866/41598_2025_11557_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/e5efe0918345/41598_2025_11557_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/00705e9674a9/41598_2025_11557_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/0a7afd552026/41598_2025_11557_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/e49a0bf061e8/41598_2025_11557_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/a5042119d13d/41598_2025_11557_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/3d5b1ec32dc1/41598_2025_11557_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/dfad37995af8/41598_2025_11557_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/427a/12274395/03fdf3ae04a5/41598_2025_11557_Fig14_HTML.jpg
