School of Mathematics, Cardiff University, Cardiff, United Kingdom.
Google Inc., Mountain View, CA, United States of America.
PLoS One. 2024 Jul 26;19(7):e0304641. doi: 10.1371/journal.pone.0304641. eCollection 2024.
Establishing and maintaining mutual cooperation in agent-to-agent interactions can be viewed as a question of direct reciprocity and readily applied to the Iterated Prisoner's Dilemma. Agents cooperate, at a small cost to themselves, in the hope of obtaining a future benefit. Zero-determinant strategies, introduced in 2012, have a subclass of strategies that are provably extortionate. In the established literature, most studies of the effectiveness, or lack thereof, of zero-determinant strategies place some zero-determinant strategy in a specific scenario (a collection of agents) and evaluate its performance either numerically or theoretically. Extortionate strategies are algebraically rigid and memory-one by definition, and identifying them requires complete knowledge of a strategy (its memory-one cooperation probabilities). The contribution of this work is a method to detect extortionate behaviour from the history of play of an arbitrary strategy. This inverts the paradigm of most studies: instead of observing the effectiveness of some theoretically extortionate strategies, the largest known collection of strategies is observed and the intensity of their extortion is quantified empirically. Moreover, we show that the lack of adaptability of extortionate strategies extends to this broader definition.
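To make the idea of detecting extortion from a history of play concrete, the following is a minimal sketch and not necessarily the procedure used in this work. It assumes the conventional payoffs (R, S, T, P) = (3, 0, 5, 1) and the standard 2012 parametrisation of extortionate zero-determinant strategies, in which the vector p̃ = (p1 − 1, p2 − 1, p3, p4) equals α(S_X − P) + β(S_Y − P) with extortion factor χ = −β/α. The function names `estimate_memory_one`, `extortionate_strategy` and `extortion_fit`, and the use of the least-squares residual as a score, are hypothetical choices made for illustration.

```python
import numpy as np

# Conventional Prisoner's Dilemma payoffs (an assumption; the abstract gives none).
R, S, T, P = 3, 0, 5, 1
# Previous-round states as (own move, opponent move).
STATES = [("C", "C"), ("C", "D"), ("D", "C"), ("D", "D")]
# Payoff vectors over STATES for the focal player (X) and the opponent (Y),
# shifted by the mutual-defection payoff P.
SX = np.array([R, S, T, P], dtype=float) - P
SY = np.array([R, T, S, P], dtype=float) - P
SHIFT = np.array([1.0, 1.0, 0.0, 0.0])


def estimate_memory_one(history):
    """Estimate P(cooperate | previous state) from (own, opponent) move tuples.

    States that never occur in the history are reported as NaN.
    """
    seen = {s: 0 for s in STATES}
    coop = {s: 0 for s in STATES}
    for prev, (next_own, _) in zip(history, history[1:]):
        seen[prev] += 1
        coop[prev] += next_own == "C"
    return np.array([coop[s] / seen[s] if seen[s] else np.nan for s in STATES])


def extortionate_strategy(chi, phi):
    """Memory-one probabilities of an extortionate ZD strategy (factor chi, scale phi)."""
    return phi * (SX - chi * SY) + SHIFT


def extortion_fit(p):
    """Least-squares fit of p to the extortionate family.

    Returns (chi, sse): the fitted extortion factor and the sum of squared
    residuals. Rows with NaN (unvisited states) are ignored; at least two
    observed states are needed for a meaningful fit.
    """
    p_tilde = np.asarray(p, dtype=float) - SHIFT
    A = np.column_stack([SX, SY])
    keep = ~np.isnan(p_tilde)
    x, *_ = np.linalg.lstsq(A[keep], p_tilde[keep], rcond=None)
    alpha, beta = x
    resid = A[keep] @ x - p_tilde[keep]
    chi = -beta / alpha if alpha != 0 else float("inf")
    return chi, float(resid @ resid)


if __name__ == "__main__":
    # A short history in which the focal player copies the opponent's last move.
    history = [("C", "C"), ("C", "D"), ("D", "D"), ("D", "C"), ("C", "C"), ("C", "D")]
    print(extortion_fit(estimate_memory_one(history)))
    print(extortion_fit(extortionate_strategy(chi=2.0, phi=1 / 18)))
    print(extortion_fit([0, 0, 0, 0]))  # always defect
```

Under these assumptions, a residual near zero together with χ > 1 indicates behaviour consistent with an extortionate strategy, while a large residual means the observed play lies far from that family. A strategy that simply copies the opponent's last move fits exactly with χ = 1, the boundary case in which no surplus is extorted.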