Tang Wei, Wang Guoling, Xing Zhiyan
School of Information Engineering, Guizhou Open University, Guiyang, Guizhou, 550023, China.
School of Mathematics and Statistics, Guizhou University, Guiyang, Guizhou, 550025, China.
Sci Rep. 2025 Jul 18;15(1):26136. doi: 10.1038/s41598-025-11557-y.
Promoting cooperation remains a major challenge in natural science. While most studies focus on single-strategy update rules, individuals in real life often use multiple strategies in response to dynamic environments. This paper introduces a mixed update rule combining imitation learning (IL) and reinforcement learning (RL). In IL, individuals adopt strategies from higher-payoff opponents, whereas in RL they rely on their own experience. Simulations of the Prisoner's Dilemma Game (PDG), Coexistence Game (CG), and Coordination Game (CoG), in both well-mixed populations and square lattice networks, show that: (i) cooperation and defection coexist in the PDG, resolving the dilemma of universal defection; (ii) cooperation exceeds the mixed Nash equilibrium in the CG; and (iii) cooperators dominate in the CoG. The mixed update rule outperforms single-strategy approaches in these games, highlighting its effectiveness in fostering cooperation.
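As a rough illustration of how a mixed update rule of this kind might be simulated in a well-mixed population, the Python sketch below lets each agent update by imitation with probability p_imitate and by reinforcement learning otherwise. The abstract does not give the rule's details, so everything specific here is an assumption made for illustration: the Fermi imitation probability, the stateless Q-learning step with epsilon-greedy choice, the payoff values, and all function and parameter names. It is not the authors' implementation and is not expected to reproduce their results.

```python
import math
import random

# Payoff to the focal player; strategy 0 = cooperate, 1 = defect.
# Illustrative weak Prisoner's Dilemma values (assumed, not from the paper).
PAYOFF = [[1.0, 0.0],
          [1.5, 0.0]]


def random_other(i, n):
    """Pick a uniformly random index in range(n) different from i."""
    j = random.randrange(n - 1)
    return j + 1 if j >= i else j


def imitation_update(my_strategy, my_payoff, other_strategy, other_payoff, noise=0.1):
    """Assumed Fermi rule: copy the other agent more often when it earned more."""
    p_adopt = 1.0 / (1.0 + math.exp((my_payoff - other_payoff) / noise))
    return other_strategy if random.random() < p_adopt else my_strategy


def rl_update(q, action, reward, alpha=0.1):
    """Stateless Q-learning step: move q[action] toward the received reward."""
    q[action] += alpha * (reward - q[action])


def rl_choose(q, epsilon=0.05):
    """Epsilon-greedy choice over the two actions (ties favour cooperation)."""
    if random.random() < epsilon:
        return random.randrange(2)
    return 0 if q[0] >= q[1] else 1


def simulate(n=100, steps=200, p_imitate=0.5, seed=0):
    """Return the final fraction of cooperators under the assumed mixed rule."""
    random.seed(seed)
    strategies = [random.randrange(2) for _ in range(n)]
    q_tables = [[0.0, 0.0] for _ in range(n)]
    for _ in range(steps):
        # Each agent plays one game against a random opponent.
        payoffs = [PAYOFF[strategies[i]][strategies[random_other(i, n)]]
                   for i in range(n)]
        new_strategies = strategies[:]
        for i in range(n):
            # Every agent learns from its own payoff, whichever rule it uses next.
            rl_update(q_tables[i], strategies[i], payoffs[i])
            if random.random() < p_imitate:
                j = random_other(i, n)
                new_strategies[i] = imitation_update(
                    strategies[i], payoffs[i], strategies[j], payoffs[j])
            else:
                new_strategies[i] = rl_choose(q_tables[i])
        strategies = new_strategies  # synchronous update
    return strategies.count(0) / n


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(f"p_imitate={p}: cooperator fraction = {simulate(p_imitate=p):.2f}")
```

Setting p_imitate to 0 or 1 recovers a pure RL or pure imitation population, which gives a simple baseline against which the mixed case can be compared. A square-lattice variant would replace random_other with a choice among an agent's four nearest neighbours.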