Evolutionary strategies of stochastic learning automata in the prisoner's dilemma.

Author information

Billard E A

Affiliation

Faculty of Computer Science and Engineering, University of Aizu, Fukushima, Japan.

Publication information

Biosystems. 1996;39(2):93-107. doi: 10.1016/0303-2647(96)01604-8.

Abstract

Stochastic learning automata (SLA) model stimulus-response species which receive feedback from the environment and adjust their mixed strategies in a Prisoner's Dilemma. A large heterogeneous population consists of SLA applying different strategies (i.e. different learning parameters) and other players applying deterministic strategies, Tit-For-Tat (TFT) or Always-Defect (ALLD). The predicted equilibria determine the payoffs within a generation for applying particular strategies and these equilibria are confirmed by simulation. The resultant population dynamics over many generations show that SLA with insensitive penalty responses strongly favor defection and dominate in subsequent generations over SLA with sensitive penalty responses. The SLA strategies are not evolutionarily stable as they can be invaded by TFT or ALLD. With the introduction of memory in the stimulus-response model, SLA learn to cooperate with TFT players.
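As a rough illustration of the kind of stimulus-response learner the abstract describes, the sketch below lets a single two-action automaton play a repeated Prisoner's Dilemma against TFT or ALLD. Everything specific here is an assumption for illustration and is not taken from the paper: the linear reward-penalty update rule, the parameter names `a` (reward rate) and `b` (penalty rate), the payoff values 5/3/1/0, and the rule that an opponent's cooperation counts as reward while an opponent's defection counts as penalty.

```python
import random

# Row-player payoffs. Assumed standard values (T=5, R=3, P=1, S=0),
# not taken from the paper.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def update_p(p, action, rewarded, a, b):
    """Linear reward-penalty update of p = Pr(cooperate).

    a is the reward learning rate, b the penalty learning rate
    (hypothetical names; a small b models an insensitive penalty response).
    """
    if action == "C":
        # Reward reinforces C; penalty shifts probability away from C.
        return p + a * (1.0 - p) if rewarded else p * (1.0 - b)
    # action == "D": the same rule applied to Pr(defect) = 1 - p.
    return p * (1.0 - a) if rewarded else p + b * (1.0 - p)


def play(opponent, a, b, rounds=1000, p0=0.5, seed=0):
    """Play the automaton against 'TFT' or 'ALLD'.

    Returns the final Pr(cooperate) and the automaton's mean payoff per round.
    """
    rng = random.Random(seed)
    p = p0
    last_sla = "C"  # TFT opens by cooperating, then copies the last move
    total = 0
    for _ in range(rounds):
        sla = "C" if rng.random() < p else "D"
        opp = last_sla if opponent == "TFT" else "D"
        total += PAYOFF[(sla, opp)]
        # Assumption: opponent cooperation is a reward, defection a penalty.
        p = update_p(p, sla, opp == "C", a, b)
        last_sla = sla
    return p, total / rounds
```

Calling `play("TFT", a=0.1, b=0.01)` and `play("ALLD", a=0.1, b=0.01)` with different values of `b` gives a feel for how penalty sensitivity changes the learned mixed strategy. The population-level dynamics and the memory extension mentioned in the abstract are outside this sketch.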

Similar articles

Evolutionary cycles of cooperation and defection. Proc Natl Acad Sci U S A. 2005 Aug 2;102(31):10797-800. doi: 10.1073/pnas.0502589102. Epub 2005 Jul 25.
Tit-for-tat or win-stay, lose-shift? J Theor Biol. 2007 Aug 7;247(3):574-80. doi: 10.1016/j.jtbi.2007.03.027. Epub 2007 Mar 24.
The art of war: beyond memory-one strategies in population games. PLoS One. 2015 Mar 24;10(3):e0120625. doi: 10.1371/journal.pone.0120625. eCollection 2015.
Iterated Prisoner's Dilemma: pay-off variance. J Theor Biol. 1997 Sep 7;188(1):1-10. doi: 10.1006/jtbi.1997.0439.
Contingencies of reinforcement in a five-person prisoner's dilemma. J Exp Anal Behav. 2004 Sep;82(2):161-76. doi: 10.1901/jeab.2004.82-161.
