Coupled replicator equations for the dynamics of learning in multiagent systems.

Author information

Sato Yuzuru, Crutchfield James P

Affiliation

Brain Science Institute, Institute of Physical and Chemical Research (RIKEN), 2-1 Hirosawa, Saitama 351-0198, Japan.

Publication information

Phys Rev E Stat Nonlin Soft Matter Phys. 2003 Jan;67(1 Pt 2):015206. doi: 10.1103/PhysRevE.67.015206. Epub 2003 Jan 31.

DOI:10.1103/PhysRevE.67.015206

PMID:12636552

Abstract

Starting with a group of reinforcement-learning agents, we derive coupled replicator equations that describe the dynamics of collective learning in multiagent systems. We show that, although agents model their environment in a self-interested way without sharing knowledge, a game dynamics emerges naturally through environment-mediated interactions. An application to rock-scissors-paper game interactions shows that the collective learning dynamics exhibits a diversity of competitive and cooperative behaviors. These include quasiperiodicity, stable limit cycles, intermittency, and deterministic chaos, behaviors that should be expected in heterogeneous multiagent systems described by the general replicator equations we derive.
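The abstract does not state the equations themselves. As a rough illustration only, the sketch below integrates a two-agent replicator system of the kind the abstract describes: each agent's probability of choosing an action grows in proportion to how far that action's reward exceeds the agent's average reward (scaled by an assumed sensitivity `beta`), with an assumed memory-loss term weighted by `alpha`. The function names, the tie reward `eps`, the parameter values, and the fixed-step Runge-Kutta integrator are all illustrative assumptions, not the authors' code.

```python
import math


def rsp_payoff(eps):
    """Rock-scissors-paper payoffs (rows: own action, columns: opponent's).

    Action order is rock, scissors, paper; eps is an assumed reward for a tie.
    """
    return [[eps, 1.0, -1.0],
            [-1.0, eps, 1.0],
            [1.0, -1.0, eps]]


def replicator_rhs(x, y, payoff, alpha, beta):
    """Time derivative of one agent's mixed strategy x against opponent y."""
    rewards = [sum(a * yj for a, yj in zip(row, y)) for row in payoff]
    mean_reward = sum(xi * r for xi, r in zip(x, rewards))
    mean_log = sum(xi * math.log(xi) for xi in x)
    # Reward-driven selection plus an (assumed) memory-loss term.
    return [xi * (beta * (r - mean_reward) + alpha * (mean_log - math.log(xi)))
            for xi, r in zip(x, rewards)]


def coupled_rhs(state, A, B, alpha=(0.0, 0.0), beta=(1.0, 1.0)):
    """Derivative of the joint state [x1, x2, x3, y1, y2, y3]."""
    x, y = state[:3], state[3:]
    return (replicator_rhs(x, y, A, alpha[0], beta[0])
            + replicator_rhs(y, x, B, alpha[1], beta[1]))


def rk4_step(state, dt, **params):
    """One classical fourth-order Runge-Kutta step of size dt."""
    def shifted(k, h):
        return [s + h * ki for s, ki in zip(state, k)]

    k1 = coupled_rhs(state, **params)
    k2 = coupled_rhs(shifted(k1, dt / 2), **params)
    k3 = coupled_rhs(shifted(k2, dt / 2), **params)
    k4 = coupled_rhs(shifted(k3, dt), **params)
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def simulate(x0, y0, A, B, steps=2000, dt=0.01, **kw):
    """Integrate from interior starting strategies x0, y0; return the final pair."""
    state = list(x0) + list(y0)
    for _ in range(steps):
        state = rk4_step(state, dt, A=A, B=B, **kw)
    return state[:3], state[3:]


if __name__ == "__main__":
    A, B = rsp_payoff(0.5), rsp_payoff(-0.3)  # illustrative tie rewards
    x, y = simulate([0.5, 0.3, 0.2], [0.2, 0.5, 0.3], A, B)
    print("agent X strategy:", x)
    print("agent Y strategy:", y)
```

Each agent's derivative sums to zero, so the strategies stay on the probability simplex. The uniform strategy (1/3, 1/3, 1/3) for both agents is a fixed point of this sketch for any tie reward. Starting strategies must have strictly positive entries because the memory-loss term takes logarithms.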

Similar articles

Deterministic limit of temporal difference reinforcement learning for stochastic games.

Phys Rev E. 2019 Apr;99(4-1):043305. doi: 10.1103/PhysRevE.99.043305.

Replicator-mutator dynamics of the rock-paper-scissors game: Learning through mistakes.

Phys Rev E. 2024 Mar;109(3-1):034404. doi: 10.1103/PhysRevE.109.034404.

Intrinsic noise in game dynamical learning.

Phys Rev Lett. 2009 Nov 6;103(19):198702. doi: 10.1103/PhysRevLett.103.198702.

Nonlinear dynamics of the rock-paper-scissors game with mutations.

Phys Rev E Stat Nonlin Soft Matter Phys. 2015 May;91(5):052907. doi: 10.1103/PhysRevE.91.052907. Epub 2015 May 11.

Coevolutionary networks of reinforcement-learning agents.

Phys Rev E Stat Nonlin Soft Matter Phys. 2013 Jul;88(1):012815. doi: 10.1103/PhysRevE.88.012815. Epub 2013 Jul 24.

Data-Based Optimal Control of Multiagent Systems: A Reinforcement Learning Design Approach.

IEEE Trans Cybern. 2019 Dec;49(12):4441-4449. doi: 10.1109/TCYB.2018.2868715. Epub 2018 Sep 26.

Chaos in learning a simple two-person game.

Proc Natl Acad Sci U S A. 2002 Apr 2;99(7):4748-51. doi: 10.1073/pnas.032086299.

Model learning and knowledge sharing for a multiagent system with Dyna-Q learning.

IEEE Trans Cybern. 2015 May;45(5):964-76. doi: 10.1109/TCYB.2014.2341582. Epub 2014 Aug 5.

Multiagent Learning of Coordination in Loosely Coupled Multiagent Systems.

IEEE Trans Cybern. 2015 Dec;45(12):2853-67. doi: 10.1109/TCYB.2014.2387277. Epub 2015 Jan 13.

Cited by

Heterogeneity, reinforcement learning, and chaos in population games.

Proc Natl Acad Sci U S A. 2025 Jun 24;122(25):e2319929121. doi: 10.1073/pnas.2319929121. Epub 2025 Jun 16.

Collective cooperative intelligence.

Proc Natl Acad Sci U S A. 2025 Jun 24;122(25):e2319948121. doi: 10.1073/pnas.2319948121. Epub 2025 Jun 16.

How social reinforcement learning can lead to metastable polarisation and the voter model.

PLoS One. 2024 Dec 17;19(12):e0313951. doi: 10.1371/journal.pone.0313951. eCollection 2024.

Limits on the evolutionary rates of biological traits.

Sci Rep. 2024 May 17;14(1):11314. doi: 10.1038/s41598-024-61872-z.

Modelling Spirals of Silence and Echo Chambers by Learning from the Feedback of Others.

Entropy (Basel). 2022 Oct 18;24(10):1484. doi: 10.3390/e24101484.

Dynamical systems as a level of cognitive analysis of multi-agent learning: Algorithmic foundations of temporal-difference learning dynamics.

Neural Comput Appl. 2022;34(3):1653-1671. doi: 10.1007/s00521-021-06117-0. Epub 2021 Jun 23.

Evolution of heterogeneous perceptual limits and indifference in competitive foraging.

PLoS Comput Biol. 2021 Feb 23;17(2):e1008734. doi: 10.1371/journal.pcbi.1008734. eCollection 2021 Feb.

Conditional rehabilitation of cooperation under strategic uncertainty.

J Math Biol. 2019 Oct;79(5):1973-2003. doi: 10.1007/s00285-019-01417-5. Epub 2019 Aug 29.

Dynamical selection of Nash equilibria using reinforcement learning: Emergence of heterogeneous mixed equilibria.

PLoS One. 2018 Jul 9;13(7):e0196577. doi: 10.1371/journal.pone.0196577. eCollection 2018.

The prevalence of chaotic dynamics in games with many players.

Sci Rep. 2018 Mar 20;8(1):4902. doi: 10.1038/s41598-018-22013-5.
