DeepMind, London, UK.
Team Liquid, Utrecht, Netherlands.
Nature. 2019 Nov;575(7782):350-354. doi: 10.1038/s41586-019-1724-z. Epub 2019 Oct 30.
Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions, the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.
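To make the league idea in the abstract concrete, here is a minimal sketch of league-style training with prioritized opponent sampling. It is a toy illustration under strong simplifying assumptions: the Player class, the Elo-style match model, the scalar "skill" update, and all names below are hypothetical stand-ins for the paper's deep neural networks and reinforcement learning updates, not AlphaStar's actual method.

    # Toy sketch of a league training loop (all components hypothetical).
    import random
    from dataclasses import dataclass, field

    @dataclass
    class Player:
        name: str
        skill: float = 0.0                      # stand-in for network parameters
        wins: dict = field(default_factory=dict)  # running win rate vs. each opponent

    def play_match(a: Player, b: Player) -> bool:
        """Return True if `a` wins; a toy Elo-logistic model of match outcome."""
        p_a_wins = 1.0 / (1.0 + 10 ** ((b.skill - a.skill) / 400.0))
        return random.random() < p_a_wins

    def sample_opponent(league, learner):
        """Prioritized sampling: prefer opponents the learner currently loses to."""
        weights = [1.0 - learner.wins.get(p.name, 0.5) + 0.1 for p in league]
        return random.choices(league, weights=weights, k=1)[0]

    def train(steps=5000, snapshot_every=500):
        # Learner starts from a nonzero rating, e.g. after supervised
        # initialization on human games (as the abstract describes).
        learner = Player("learner", skill=1200.0)
        league = [Player("initial_opponent", skill=1200.0)]
        for step in range(1, steps + 1):
            opp = sample_opponent(league, learner)
            won = play_match(learner, opp)
            # Toy "RL update": nudge skill based on the sampled game's outcome.
            learner.skill += 2.0 if won else -1.0
            w = learner.wins.get(opp.name, 0.5)
            learner.wins[opp.name] = 0.95 * w + 0.05 * (1.0 if won else 0.0)
            if step % snapshot_every == 0:
                # Freeze a copy of the learner as a new league opponent,
                # so future training must beat past strategies too.
                league.append(Player(f"snapshot_{step}", skill=learner.skill))
        return learner, league

    if __name__ == "__main__":
        learner, league = train()
        print(f"final skill {learner.skill:.0f}, league size {len(league)}")

The sampling step captures the central design idea: learners preferentially face the opponents they currently lose to, so the league keeps producing strategies and counter-strategies instead of overfitting to a single fixed opponent.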