Sato Yuzuru, Crutchfield James P
Brain Science Institute, Institute of Physical and Chemical Research (RIKEN), 2-1 Hirosawa, Saitama 351-0198, Japan.
Phys Rev E Stat Nonlin Soft Matter Phys. 2003 Jan;67(1 Pt 2):015206. doi: 10.1103/PhysRevE.67.015206. Epub 2003 Jan 31.
Starting with a group of reinforcement-learning agents, we derive coupled replicator equations that describe the dynamics of collective learning in multiagent systems. We show that, although agents model their environment in a self-interested way without sharing knowledge, a game dynamics emerges naturally through environment-mediated interactions. An application to rock-scissors-paper game interactions shows that the collective learning dynamics exhibits a diversity of competitive and cooperative behaviors. These include quasiperiodicity, stable limit cycles, intermittency, and deterministic chaos. Such behaviors should be expected in heterogeneous multiagent systems described by the general replicator equations we derive.
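To make the idea of coupled replicator dynamics concrete, the following is a minimal sketch of a two-agent rock-scissors-paper system. It uses the textbook two-population replicator form, which is not necessarily the set of equations derived in the paper. The payoff values (win = +1, loss = -1), the tie payoffs eps_x and eps_y, the function names, the step size, and the initial conditions are all illustrative assumptions.

```python
# Hypothetical sketch: textbook two-population replicator dynamics for
# rock-scissors-paper. Not taken from the paper; all names and parameter
# values here are assumptions made for illustration.

def rsp_payoff(eps):
    """Payoff matrix; rows are own action (R, S, P), columns the opponent's.
    Assumed payoffs: win = +1, loss = -1, tie = eps."""
    return [[eps, 1.0, -1.0],
            [-1.0, eps, 1.0],
            [1.0, -1.0, eps]]

def replicator_rhs(x, y, A, B):
    """dx_i/dt = x_i((A y)_i - x.A y),  dy_i/dt = y_i((B x)_i - y.B x)."""
    Ay = [sum(A[i][j] * y[j] for j in range(3)) for i in range(3)]
    Bx = [sum(B[i][j] * x[j] for j in range(3)) for i in range(3)]
    xAy = sum(x[i] * Ay[i] for i in range(3))
    yBx = sum(y[i] * Bx[i] for i in range(3))
    dx = [x[i] * (Ay[i] - xAy) for i in range(3)]
    dy = [y[i] * (Bx[i] - yBx) for i in range(3)]
    return dx, dy

def _shift(u, k, h):
    """Return u + h * k componentwise."""
    return [ui + h * ki for ui, ki in zip(u, k)]

def rk4_step(x, y, A, B, dt):
    """One classical fourth-order Runge-Kutta step for both agents."""
    k1x, k1y = replicator_rhs(x, y, A, B)
    k2x, k2y = replicator_rhs(_shift(x, k1x, dt / 2), _shift(y, k1y, dt / 2), A, B)
    k3x, k3y = replicator_rhs(_shift(x, k2x, dt / 2), _shift(y, k2y, dt / 2), A, B)
    k4x, k4y = replicator_rhs(_shift(x, k3x, dt), _shift(y, k3y, dt), A, B)
    x_new = [x[i] + dt / 6 * (k1x[i] + 2 * k2x[i] + 2 * k3x[i] + k4x[i])
             for i in range(3)]
    y_new = [y[i] + dt / 6 * (k1y[i] + 2 * k2y[i] + 2 * k3y[i] + k4y[i])
             for i in range(3)]
    return x_new, y_new

def simulate(x0, y0, eps_x=0.5, eps_y=-0.3, dt=0.01, steps=2000):
    """Integrate the coupled system; returns the list of (x, y) states."""
    A, B = rsp_payoff(eps_x), rsp_payoff(eps_y)
    x, y = list(x0), list(y0)
    trajectory = [(x, y)]
    for _ in range(steps):
        x, y = rk4_step(x, y, A, B, dt)
        trajectory.append((x, y))
    return trajectory

if __name__ == "__main__":
    traj = simulate([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
    x_end, y_end = traj[-1]
    print("final x:", x_end)
    print("final y:", y_end)
```

Each agent's mixed strategy stays on the probability simplex because the right-hand side sums to zero whenever the components sum to one. Varying eps_x, eps_y, and the initial conditions is one way to explore how the character of the trajectories changes, though which regimes appear in this simplified form is not something the abstract specifies.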