The evolution of inefficiency in a simulated stag hunt.

Author information

Bearden J N

Affiliation

Department of Psychology, University of North Carolina, Chapel Hill, NC 27514, USA.

Publication information

Behav Res Methods Instrum Comput. 2001 May;33(2):124-9. doi: 10.3758/bf03195357.

DOI: 10.3758/bf03195357

PMID: 11447664

Abstract

We used genetic algorithms to evolve populations of reinforcement learning (Q-learning) agents to play a repeated two-player symmetric coordination game under different risk conditions and found that evolution steered our simulated populations to the Pareto inefficient equilibrium under high-risk conditions and to the Pareto efficient equilibrium under low-risk conditions. Greater degrees of forgiveness and temporal discounting of future returns emerged in populations playing the low-risk game. Results demonstrate the utility of simulation to evolutionary psychology.
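
The setup described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the payoff values, the genome encoding (learning rate, discount factor, exploration rate), the state representation (the opponent's previous move), and the truncation-selection scheme are all assumptions chosen for brevity. The abstract's "forgiveness" trait is not modeled here. Risk is represented by a hypothetical `sucker` parameter, the payoff for hunting stag alone: a lower value means a riskier game.

```python
import random

STAG, HARE = 0, 1

def payoff(a, b, sucker):
    """Row player's payoff (hypothetical values, not from the paper).
    `sucker` is the payoff for choosing stag while the partner chooses hare."""
    if a == STAG:
        return 4.0 if b == STAG else sucker
    return 3.0  # hare pays the same regardless of the partner's move

class QAgent:
    """Tabular Q-learner; the state is the opponent's previous move."""
    def __init__(self, alpha, gamma, epsilon, rng):
        self.alpha, self.gamma, self.epsilon, self.rng = alpha, gamma, epsilon, rng
        self.q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]

    def act(self, state):
        if self.rng.random() < self.epsilon:  # explore
            return self.rng.randrange(2)
        row = self.q[state]
        return STAG if row[STAG] >= row[HARE] else HARE

    def learn(self, s, a, r, s2):
        # Standard Q-learning update with discount factor gamma.
        target = r + self.gamma * max(self.q[s2])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def play(p, q, sucker, rounds):
    """Repeated game; returns p's mean payoff per round."""
    sp = sq = STAG  # assumed initial state for both agents
    total = 0.0
    for _ in range(rounds):
        a, b = p.act(sp), q.act(sq)
        ra, rb = payoff(a, b, sucker), payoff(b, a, sucker)
        p.learn(sp, a, ra, b)
        q.learn(sq, b, rb, a)
        sp, sq = b, a
        total += ra
    return total / rounds

def mutate(genome, rng):
    # Gaussian mutation, clipped so every gene stays in [0, 1].
    return [min(1.0, max(0.0, x + rng.gauss(0.0, 0.05))) for x in genome]

def evolve(sucker, pop_size=20, generations=10, rounds=50, seed=0):
    """Evolve genomes [alpha, gamma, epsilon] by truncation selection."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = []
        for genome in pop:
            partner = rng.choice(pop)  # random pairing within the population
            fitness = play(QAgent(*genome, rng), QAgent(*partner, rng), sucker, rounds)
            scored.append((fitness, genome))
        scored.sort(key=lambda t: t[0], reverse=True)
        parents = [g for _, g in scored[:pop_size // 2]]
        pop = [mutate(rng.choice(parents), rng) for _ in range(pop_size)]
    return pop
```

Calling `evolve(sucker=0.0)` and `evolve(sucker=2.0)` would give a hypothetical high-risk and low-risk population whose evolved discount factors could then be compared. Whether such a toy model reproduces the reported pattern is not something this sketch establishes.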
