Decentralized learning in Markov games.

Authors

Vrancx Peter, Verbeeck Katja, Nowé Ann

Affiliations

Computational Modeling Laboratory (COMO), Vrije Universiteit Brussel, 1050 Brussels, Belgium.

Publication Information

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):976-81. doi: 10.1109/TSMCB.2008.920998.

Abstract

Learning automata (LA) were recently shown to be valuable tools for designing multiagent reinforcement learning algorithms. One of the principal contributions of the LA theory is that a set of decentralized independent LA is able to control a finite Markov chain with unknown transition probabilities and rewards. In this paper, we propose to extend this algorithm to Markov games--a straightforward extension of single-agent Markov decision problems to distributed multiagent decision problems. We show that under the same ergodic assumptions of the original theorem, the extended algorithm will converge to a pure equilibrium point between agent policies.

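The abstract describes the approach only at a high level, so the following Python sketch is illustrative rather than a reproduction of the paper's algorithm. It assumes one independent learning automaton per agent per state, updated with the linear reward-inaction (L_R-I) rule from the LA literature; the ToyMarkovGame environment, the use of immediate rewards (the decentralized Markov-chain control result the abstract builds on updates an automaton only when its state is revisited, from reward gathered in between), and all parameter values are assumptions made purely for illustration.

import random


class LearningAutomaton:
    """A single automaton: a probability vector over actions, updated with the
    linear reward-inaction (L_R-I) scheme."""

    def __init__(self, n_actions, learning_rate=0.05):
        self.alpha = learning_rate
        self.probs = [1.0 / n_actions] * n_actions

    def choose_action(self):
        # Sample an action index according to the current probability vector.
        r, acc = random.random(), 0.0
        for action, p in enumerate(self.probs):
            acc += p
            if r <= acc:
                return action
        return len(self.probs) - 1

    def update(self, action, reward):
        # L_R-I update: shift probability mass toward the chosen action in
        # proportion to the reward (assumed normalized to [0, 1]); a zero
        # reward leaves the vector unchanged ("inaction").
        for a in range(len(self.probs)):
            if a == action:
                self.probs[a] += self.alpha * reward * (1.0 - self.probs[a])
            else:
                self.probs[a] -= self.alpha * reward * self.probs[a]


class ToyMarkovGame:
    """Hypothetical 2-agent, 2-state, 2-action coordination game, used only to
    exercise the sketch. Both agents are rewarded when they choose the same
    action; matching also changes the state-transition probability."""

    n_agents, n_states, n_actions = 2, 2, 2

    def reset(self):
        self.state = 0
        return self.state

    def step(self, joint_action):
        match = 1.0 if joint_action[0] == joint_action[1] else 0.0
        if random.random() < 0.5 + 0.4 * match:
            self.state = 1 - self.state
        return self.state, [match] * self.n_agents


def run(env, automata, n_steps=20000):
    # Decentralized loop: each agent consults only its own automaton for the
    # current state and updates it from its own reward signal.
    state = env.reset()
    for _ in range(n_steps):
        joint_action = [automata[i][state].choose_action()
                        for i in range(env.n_agents)]
        next_state, rewards = env.step(joint_action)
        for i in range(env.n_agents):
            automata[i][state].update(joint_action[i], rewards[i])
        state = next_state


env = ToyMarkovGame()
automata = [[LearningAutomaton(env.n_actions) for _ in range(env.n_states)]
            for _ in range(env.n_agents)]
run(env, automata)
# In this toy game the per-state automata of both agents should concentrate
# on a matching (pure) joint action, i.e. a pure equilibrium between policies.
print([[max(la.probs) for la in per_state] for per_state in automata])

Note the decentralized structure: each agent samples and updates only its own per-state automaton from its own reward, without observing the other agents' action probabilities, which mirrors the "decentralized independent LA" setting described in the abstract.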
