Maximizing cooperation in the prisoner's dilemma evolutionary game via optimal control.

Author information

Newton P K, Ma Y

Affiliations

Department of Aerospace & Mechanical Engineering, Mathematics, and The Ellison Institute, University of Southern California, Los Angeles, California 90089-1191, USA.

Department of Physics & Astronomy, University of Southern California, Los Angeles, California 90089-1191, USA.

Publication information

Phys Rev E. 2021 Jan;103(1-1):012304. doi: 10.1103/PhysRevE.103.012304.

DOI:10.1103/PhysRevE.103.012304

PMID:33601552

Abstract

The prisoner's dilemma (PD) game offers a simple paradigm of competition between two players who can either cooperate or defect. Since defection is a strict Nash equilibrium, it is an asymptotically stable state of the replicator dynamical system that uses the PD payoff matrix to define the fitness landscape of two interacting evolving populations. The dilemma arises from the fact that the average payoff of this asymptotically stable state is suboptimal. Coaxing the players to cooperate would result in a higher payoff for both. Here we develop an optimal control theory for the prisoner's dilemma evolutionary game in order to maximize cooperation (minimize the defector population) over a given cycle time T, subject to constraints. Our two time-dependent controllers are applied to the off-diagonal elements of the payoff matrix in a bang-bang sequence that dynamically changes the game being played by dynamically adjusting the payoffs, with optimal timing that depends on the initial population distributions. Over multiple cycles nT (n>1), the method is adaptive as it uses the defector population at the end of the nth cycle to calculate the optimal schedule over the n+1st cycle. The control method, based on Pontryagin's maximum principle, can be viewed as determining the optimal way to dynamically alter incentives and penalties in order to maximize the probability of cooperation in settings that track dynamic changes in the frequency of strategists, with potential applications in evolutionary biology, economics, theoretical ecology, social sciences, reinforcement learning, and other fields where the replicator system is used.
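The dynamics described in the abstract can be illustrated with a minimal sketch. It assumes the standard two-strategy replicator equation for the cooperator fraction x, illustrative payoff values (R=3, S=0, T=5, P=1), a single hypothetical control amplitude `u_max` applied to both off-diagonal entries, and a hand-picked switching window. None of these values come from the paper, and the schedule here is not derived from Pontryagin's maximum principle; it only shows how switching the off-diagonal payoffs on and off changes the outcome over one cycle.

```python
# Assumed payoff matrix ((R, S), (T, P)) with T > R > P > S (illustrative values only).
BASE_PAYOFF = ((3.0, 0.0), (5.0, 1.0))


def replicator_rhs(x, payoff):
    """Return dx/dt for cooperator fraction x under a 2x2 payoff matrix."""
    (R, S), (T, P) = payoff
    f_coop = R * x + S * (1.0 - x)      # fitness of cooperators
    f_defect = T * x + P * (1.0 - x)    # fitness of defectors
    return x * (1.0 - x) * (f_coop - f_defect)


def no_control(t):
    """Uncontrolled game: the payoff matrix never changes."""
    return BASE_PAYOFF


def make_bang_bang(t_on, t_off, u_max):
    """Hypothetical bang-bang schedule: control is u_max inside [t_on, t_off), else 0.

    The control raises S and lowers T, i.e. it acts on the off-diagonal entries.
    """
    (R, S), (T, P) = BASE_PAYOFF

    def schedule(t):
        u = u_max if t_on <= t < t_off else 0.0
        return ((R, S + u), (T - u, P))

    return schedule


def simulate(x0, t_end, schedule, dt=1e-3):
    """Integrate the replicator equation with classical RK4 and return x(t_end)."""
    x = x0
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        k1 = replicator_rhs(x, schedule(t))
        k2 = replicator_rhs(x + 0.5 * dt * k1, schedule(t + 0.5 * dt))
        k3 = replicator_rhs(x + 0.5 * dt * k2, schedule(t + 0.5 * dt))
        k4 = replicator_rhs(x + dt * k3, schedule(t + dt))
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


if __name__ == "__main__":
    x_free = simulate(0.5, 5.0, no_control)
    x_ctrl = simulate(0.5, 5.0, make_bang_bang(1.0, 4.0, 3.0))
    print("cooperator fraction without control:", x_free)
    print("cooperator fraction with control:   ", x_ctrl)
```

With the base payoffs, the fitness difference between cooperators and defectors is negative for every x in [0, 1], so the cooperator fraction decays toward zero; this is the asymptotic stability of defection noted above. While the assumed control is switched on, the sign of that difference flips and cooperation grows, so the final cooperator fraction depends on when the window opens and closes.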

Similar articles

Transforming the dilemma.

Evolution. 2007 Oct;61(10):2281-92. doi: 10.1111/j.1558-5646.2007.00196.x. Epub 2007 Aug 17.

Evolutionary game dynamics of cooperation in prisoner's dilemma with time delay.

Math Biosci Eng. 2023 Jan 5;20(3):5024-5042. doi: 10.3934/mbe.2023233.

Individual variation evades the prisoner's dilemma.

BMC Evol Biol. 2002 Sep 10;2:15. doi: 10.1186/1471-2148-2-15.

Impact of topology on the dynamical organization of cooperation in the prisoner's dilemma game.

Phys Rev E Stat Nonlin Soft Matter Phys. 2008 Mar;77(3 Pt 2):036120. doi: 10.1103/PhysRevE.77.036120. Epub 2008 Mar 20.

Enhancement of Cooperation and Reentrant Phase of Prisoner's Dilemma Game on Signed Networks.

Entropy (Basel). 2022 Jan 18;24(2):144. doi: 10.3390/e24020144.

Emergence of super cooperation of prisoner's dilemma games on scale-free networks.

PLoS One. 2015 Feb 2;10(2):e0116429. doi: 10.1371/journal.pone.0116429. eCollection 2015.

Collapse of cooperation in evolving games.

Proc Natl Acad Sci U S A. 2014 Dec 9;111(49):17558-63. doi: 10.1073/pnas.1408618111. Epub 2014 Nov 24.

Evolutionary cycles of cooperation and defection.

Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):10797-800. doi: 10.1073/pnas.0502589102. Epub 2005 Jul 25.

Stochastic dynamics of the prisoner's dilemma with cooperation facilitators.

Phys Rev E Stat Nonlin Soft Matter Phys. 2012 Jul;86(1 Pt 1):011134. doi: 10.1103/PhysRevE.86.011134. Epub 2012 Jul 30.

Cited by

Deciphering population-level response under spatial drug heterogeneity on microhabitat structures.

bioRxiv. 2025 Mar 2:2025.02.13.638200. doi: 10.1101/2025.02.13.638200.

COVID-19 vaccine incentive scheduling using an optimally controlled reinforcement learning model.

Physica D. 2023 Mar;445:133613. doi: 10.1016/j.physd.2022.133613. Epub 2022 Dec 16.
