Melioration Learning in Two-Person Games.

Author information

Zschache Johannes

Affiliation

Institute of Sociology, Leipzig University, Leipzig, Germany.

Publication information

PLoS One. 2016 Nov 16;11(11):e0166708. doi: 10.1371/journal.pone.0166708. eCollection 2016.

DOI:10.1371/journal.pone.0166708

PMID:27851815

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5112854/

Abstract

Melioration learning is an empirically well-grounded model of reinforcement learning. By means of computer simulations, this paper derives predictions for several repeatedly played two-person games from this model. The results indicate a likely convergence to a pure Nash equilibrium of the game. If no pure equilibrium exists, the relative frequencies of choice may approach the predictions of the mixed Nash equilibrium. Yet in some games, no stable state is reached.
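The kind of simulation the abstract describes can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below assumes that melioration can be approximated by epsilon-greedy choice over the running average payoff of each action. The function name `simulate_melioration`, its parameters, and the prisoner's dilemma payoffs are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import random


def simulate_melioration(payoffs, rounds=5000, epsilon=0.1, seed=0):
    """Simulate two learners in a repeated two-person game.

    payoffs[i][j] is a (row_payoff, column_payoff) tuple for row action i
    and column action j. Returns the relative choice frequencies of both
    players. ASSUMPTION: melioration is approximated here by epsilon-greedy
    choice over sample-average payoffs; the abstract does not give the rule.
    """
    rng = random.Random(seed)
    n_actions = (len(payoffs), len(payoffs[0]))
    values = [[0.0] * n for n in n_actions]  # running average payoff per action
    counts = [[0] * n for n in n_actions]    # number of times each action was chosen

    def choose(player):
        # Explore with probability epsilon, otherwise pick a best-valued action.
        if rng.random() < epsilon:
            return rng.randrange(n_actions[player])
        best = max(values[player])
        ties = [a for a, v in enumerate(values[player]) if v == best]
        return rng.choice(ties)

    for _ in range(rounds):
        a_row, a_col = choose(0), choose(1)
        rewards = payoffs[a_row][a_col]
        for player, action in ((0, a_row), (1, a_col)):
            counts[player][action] += 1
            # Incremental update of the sample average for the chosen action.
            step = 1.0 / counts[player][action]
            values[player][action] += step * (rewards[player] - values[player][action])

    return [[c / rounds for c in counts[p]] for p in (0, 1)]


# Hypothetical prisoner's dilemma: action 0 = cooperate, action 1 = defect.
PD = [[(3, 3), (0, 5)],
      [(5, 0), (1, 1)]]
row_freq, col_freq = simulate_melioration(PD)
print("row player frequencies:", row_freq)
print("column player frequencies:", col_freq)
```

Comparing the returned frequencies with the pure or mixed Nash equilibrium of the chosen game is one way to probe the convergence claims, under the assumptions stated above.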


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cbea/5112854/f31693a95d3f/pone.0166708.g001.jpg

Similar articles

Spike-based decision learning of Nash equilibria in two-player games.

PLoS Comput Biol. 2012;8(9):e1002691. doi: 10.1371/journal.pcbi.1002691. Epub 2012 Sep 27.

Dynamical selection of Nash equilibria using reinforcement learning: Emergence of heterogeneous mixed equilibria.

PLoS One. 2018 Jul 9;13(7):e0196577. doi: 10.1371/journal.pone.0196577. eCollection 2018.

Operant matching as a Nash equilibrium of an intertemporal game.

Neural Comput. 2009 Oct;21(10):2755-73. doi: 10.1162/neco.2009.09-08-854.

Game relativity: how context influences strategic decision making.

J Exp Psychol Learn Mem Cogn. 2006 Jan;32(1):131-49. doi: 10.1037/0278-7393.32.1.131.

Reinforcement learning in complementarity game and population dynamics.

Phys Rev E Stat Nonlin Soft Matter Phys. 2014 Feb;89(2):022113. doi: 10.1103/PhysRevE.89.022113. Epub 2014 Feb 11.

Coevolutionary networks of reinforcement-learning agents.

Phys Rev E Stat Nonlin Soft Matter Phys. 2013 Jul;88(1):012815. doi: 10.1103/PhysRevE.88.012815. Epub 2013 Jul 24.

Decentralized learning in Markov games.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):976-81. doi: 10.1109/TSMCB.2008.920998.

Multiagent reinforcement learning with unshared value functions.

IEEE Trans Cybern. 2015 Apr;45(4):647-62. doi: 10.1109/TCYB.2014.2332042. Epub 2014 Jul 2.

Learning dynamics in social dilemmas.

Proc Natl Acad Sci U S A. 2002 May 14;99 Suppl 3(Suppl 3):7229-36. doi: 10.1073/pnas.092080099.

Cited by

Multi-agent reinforcement learning with approximate model learning for competitive games.

PLoS One. 2019 Sep 11;14(9):e0222215. doi: 10.1371/journal.pone.0222215. eCollection 2019.

References

Reinforcement learning and human behavior.

Curr Opin Neurobiol. 2014 Apr;25:93-8. doi: 10.1016/j.conb.2013.12.004. Epub 2014 Jan 1.

Dynamics of Boltzmann Q learning in two-player two-action games.

Phys Rev E Stat Nonlin Soft Matter Phys. 2012 Apr;85(4 Pt 1):041145. doi: 10.1103/PhysRevE.85.041145. Epub 2012 Apr 26.

Choice, matching, and human behavior: A review of the literature.

Behav Anal. 1983 Spring;6(1):57-76. doi: 10.1007/BF03391874.

Matching theory in natural human environments.

Behav Anal. 1988 Fall;11(2):95-109. doi: 10.1007/BF03392462.

Synaptic theory of replicator-like melioration.

Front Comput Neurosci. 2010 Jun 17;4:17. doi: 10.3389/fncom.2010.00017. eCollection 2010.

Generality of the matching law as a descriptor of shot selection in basketball.

J Appl Behav Anal. 2009 Fall;42(3):595-608. doi: 10.1901/jaba.2009.42-595.

Concurrent performance in a three-alternative choice situation: response allocation in a Rock/Paper/Scissors game.

Behav Processes. 2009 Oct;82(2):164-72. doi: 10.1016/j.beproc.2009.06.004. Epub 2009 Jun 23.

An application of the matching law to social dynamics.

J Appl Behav Anal. 2007 Winter;40(4):589-601. doi: 10.1901/jaba.2007.589-601.

Computational algorithms and neuronal network models underlying decision processes.

Neural Netw. 2006 Oct;19(8):1091-105. doi: 10.1016/j.neunet.2006.05.034. Epub 2006 Aug 30.

Melioration, matching, and maximization.

J Exp Anal Behav. 1981 Sep;36(2):141-9. doi: 10.1901/jeab.1981.36-141.
