Zschache Johannes
Institute of Sociology, Leipzig University, Leipzig, Germany.
PLoS One. 2016 Nov 16;11(11):e0166708. doi: 10.1371/journal.pone.0166708. eCollection 2016.
Melioration learning is an empirically well-grounded model of reinforcement learning. Using computer simulations, this paper derives the model's predictions for several repeatedly played two-person games. The results indicate that play is likely to converge to a pure Nash equilibrium of the game. If no pure equilibrium exists, the relative frequencies of choice may approach the predictions of the mixed Nash equilibrium. Yet in some games, no stable state is reached.
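To make this kind of simulation concrete, the following is a minimal sketch and not the paper's implementation. It assumes that each player keeps a running average of the reward received from each action, plays the action with the highest average, and explores a random action with a small probability epsilon. The function name `simulate`, the constant `PRISONERS_DILEMMA`, the payoff values, and all parameter defaults are hypothetical choices made for illustration; the abstract does not specify them.

```python
import random

# Illustrative Prisoner's Dilemma payoffs (assumed, not taken from the paper).
# payoffs[i][row_action][col_action] is the payoff to player i (0 = row, 1 = column).
# Action 0 = cooperate, action 1 = defect.
PRISONERS_DILEMMA = [
    [[3, 0], [5, 1]],  # row player's payoffs
    [[3, 5], [0, 1]],  # column player's payoffs
]


def simulate(payoffs, rounds=5000, epsilon=0.1, seed=0):
    """Play a repeated two-person game with two averaging learners.

    Returns, for each player, the relative frequency with which each
    action was chosen over all rounds.
    """
    rng = random.Random(seed)
    n_actions = (len(payoffs[0]), len(payoffs[0][0]))
    # Per player: running mean reward and number of plays for each action.
    means = [[0.0] * n for n in n_actions]
    counts = [[0] * n for n in n_actions]

    for _ in range(rounds):
        actions = []
        for i in range(2):
            if rng.random() < epsilon:
                # Exploration (assumed): pick any action uniformly at random.
                a = rng.randrange(n_actions[i])
            else:
                # Otherwise pick an action with the highest average reward,
                # breaking ties at random.
                best = max(means[i])
                a = rng.choice([k for k, m in enumerate(means[i]) if m == best])
            actions.append(a)

        row, col = actions
        for i, a in enumerate(actions):
            reward = payoffs[i][row][col]
            counts[i][a] += 1
            # Incremental update of the sample average for the chosen action.
            means[i][a] += (reward - means[i][a]) / counts[i][a]

    return [[c / rounds for c in counts[i]] for i in range(2)]


frequencies = simulate(PRISONERS_DILEMMA)
print("row player action frequencies:", frequencies[0])
print("column player action frequencies:", frequencies[1])
```

The returned relative frequencies can be compared with the pure or mixed Nash equilibrium of whichever payoff matrix is passed in. Different seeds and exploration rates may give different long-run behavior, which is consistent with the observation that some games do not reach a stable state.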