Bearden J N
Department of Psychology, University of North Carolina, Chapel Hill, NC 27514, USA.
Behav Res Methods Instrum Comput. 2001 May;33(2):124-9. doi: 10.3758/bf03195357.
We used genetic algorithms to evolve populations of reinforcement-learning (Q-learning) agents that played a repeated two-player symmetric coordination game under different risk conditions. Evolution steered the simulated populations toward the Pareto-inefficient equilibrium under high-risk conditions and toward the Pareto-efficient equilibrium under low-risk conditions. Greater forgiveness and greater temporal discounting of future returns emerged in populations playing the low-risk game. The results demonstrate the utility of simulation methods for evolutionary psychology.
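The method the abstract describes can be sketched in miniature: Q-learning agents repeatedly play a symmetric coordination game, and a genetic algorithm evolves their learning parameters by fitness-proportional survival. Everything below is an illustrative assumption, not the paper's actual implementation: the stag-hunt-style payoff matrix, the genome (learning rate, discount, exploration rate), and the truncation-selection GA are all placeholders chosen for brevity.

```python
# Hedged sketch: GA-evolved Q-learning agents in a repeated symmetric
# coordination game. Payoffs, genome, and GA details are illustrative
# assumptions, not those of Bearden (2001).
import random

# Stag-hunt-style payoffs: action 0 is the payoff-dominant (Pareto
# efficient) choice, action 1 the safe (risk-dominant) choice.
PAYOFF = {(0, 0): (4, 4), (0, 1): (0, 2),
          (1, 0): (2, 0), (1, 1): (2, 2)}

class Agent:
    """Stateless Q-learner; genome is (alpha, gamma, epsilon)."""
    def __init__(self, alpha, gamma, epsilon):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = [0.0, 0.0]  # one Q-value per action

    def act(self, rng):
        # Epsilon-greedy action selection.
        if rng.random() < self.epsilon:
            return rng.randrange(2)
        return 0 if self.q[0] >= self.q[1] else 1

    def update(self, action, reward):
        # Standard Q-learning update for a single-state game.
        target = reward + self.gamma * max(self.q)
        self.q[action] += self.alpha * (target - self.q[action])

def play(a, b, rounds, rng):
    """Repeated game between two agents; returns a's total payoff."""
    total = 0.0
    for _ in range(rounds):
        i, j = a.act(rng), b.act(rng)
        ri, rj = PAYOFF[(i, j)]
        a.update(i, ri)
        b.update(j, rj)
        total += ri
    return total

def evolve(pop_size=20, generations=30, rounds=50, seed=0):
    """Evolve a population of parameter genomes by truncation selection."""
    rng = random.Random(seed)
    genomes = [(rng.uniform(0.05, 0.5), rng.uniform(0.0, 0.99),
                rng.uniform(0.01, 0.3)) for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness: total payoff against a randomly drawn partner.
        fitness = [play(Agent(*g), Agent(*rng.choice(genomes)), rounds, rng)
                   for g in genomes]
        ranked = [g for _, g in sorted(zip(fitness, genomes),
                                       key=lambda p: p[0], reverse=True)]
        parents = ranked[:pop_size // 2]
        # Refill the population with Gaussian-mutated copies of parents.
        genomes = parents + [
            tuple(min(1.0, max(0.0, x + rng.gauss(0, 0.05)))
                  for x in rng.choice(parents))
            for _ in range(pop_size - len(parents))]
    return genomes

evolved = evolve()
print(len(evolved))  # population size is preserved: 20 genomes
```

Under this setup, varying the mismatch payoffs in `PAYOFF` changes the riskiness of the efficient action, which is the manipulation the abstract's high-risk versus low-risk conditions correspond to.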