Cardiff University, School of Mathematics, Cardiff, United Kingdom.
Google Inc., Mountain View, CA, United States of America.
PLoS One. 2018 Oct 25;13(10):e0204981. doi: 10.1371/journal.pone.0204981. eCollection 2018.
We present insights and empirical results from an extensive numerical study of the evolutionary dynamics of the iterated prisoner's dilemma. Fixation probabilities for Moran processes are obtained for all pairs of 164 different strategies, including classics such as TitForTat, zero-determinant strategies, and many more sophisticated strategies. Players with long memories and sophisticated behaviours outperform many strategies that perform well in a two-player setting. Moreover, we introduce several strategies trained with evolutionary algorithms to excel at the Moran process. These strategies are excellent invaders and resistors of invasion, and in some cases naturally evolve handshaking mechanisms to resist invasion. The best invaders were those trained to maximize total payoff, while the best resistors invoke handshake mechanisms. This suggests that while maximizing individual payoff can lead to the evolution of cooperation through invasion, payoff-maximizing strategies, because of their relatively weak invasion resistance, are not as evolutionarily stable as strategies employing handshake mechanisms.
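To make the notion of a fixation probability concrete, the sketch below computes the chance that a single invader takes over a population in a two-type Moran process, where at each step one individual reproduces with probability proportional to fitness and one individual is removed at random. It uses the standard closed form for birth-death chains and assumes fixed, positive pairwise payoffs. That is a simplification of the numerical study described above, and the function name and the payoff values in the example are illustrative assumptions, not taken from the study.

```python
from fractions import Fraction


def fixation_probability(a, b, c, d, n):
    """Probability that one invader of type A takes over a population of size n.

    Hypothetical helper for illustration. Payoffs (assumed fixed and positive):
      a: A against A, b: A against B, c: B against A, d: B against B.
    Fitness is taken to be the total payoff against all other individuals.
    """
    total = Fraction(1)    # running value of 1 + sum_k prod_{i<=k} gamma_i
    product = Fraction(1)  # running value of prod_{i<=k} gamma_i
    for i in range(1, n):  # i = current number of type-A individuals
        fitness_a = Fraction(a * (i - 1) + b * (n - i))
        fitness_b = Fraction(c * i + d * (n - i - 1))
        # gamma_i: ratio of the chance of losing an A to the chance of gaining one
        product *= fitness_b / fitness_a
        total += product
    return 1 / total


# Illustrative payoffs only: totals over a 10-turn match between TitForTat (A)
# and an unconditional defector (B), assuming per-turn payoffs R=3, S=0, T=5, P=1.
rho_invade = fixation_probability(30, 9, 14, 10, n=2)
rho_neutral = fixation_probability(1, 1, 1, 1, n=5)
```

When all payoffs are equal, the result reduces to the neutral value 1/n, which gives a baseline: an invader with a fixation probability above 1/n is favoured by selection, and one below it is disfavoured.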