Mastering the game of Go with deep neural networks and tree search.

Affiliations

Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.

Google, 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.

Publication information

Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.

DOI: 10.1038/nature16961
PMID: 26819042

Abstract

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
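
The abstract describes a search that uses a policy to decide which moves to explore and an evaluation of positions to score them. As a rough illustration of that idea only, the sketch below runs a prior-guided Monte Carlo tree search (a PUCT-style selection rule) on a toy take-away game instead of Go. Everything in it is an assumption made for illustration and is not taken from the paper: the game (take 1 to 3 stones, whoever takes the last stone wins), the uniform `policy` standing in for a trained policy network, the random `rollout` standing in for position evaluation, and the constant `c_puct`.

```python
import math
import random


class Node:
    """Search statistics for one position, stored from the viewpoint of
    the player who moved into it."""

    def __init__(self, prior):
        self.prior = prior      # prior probability from the policy stand-in
        self.visits = 0         # number of simulations through this node
        self.value_sum = 0.0    # sum of backed-up outcomes
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]


def policy(stones):
    # Hypothetical stand-in for a policy network: uniform over legal moves.
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}


def rollout(stones, rng):
    # Hypothetical stand-in for position evaluation: play random moves to
    # the end. Returns +1 if the player to move at `stones` wins, else -1.
    sign = 1
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return sign         # the player who just moved took the last stone
        sign = -sign
    return -1                   # no stones left: the player to move already lost


def search(root_stones, simulations=2000, c_puct=1.5, seed=0):
    """Return the most-visited move from a position with `root_stones` stones."""
    rng = random.Random(seed)
    root = Node(prior=1.0)
    for _ in range(simulations):
        node, stones, path = root, root_stones, [root]
        # Selection: descend by mean value plus a prior-weighted exploration bonus.
        while node.children:
            total = sum(child.visits for child in node.children.values())
            move, node = max(
                node.children.items(),
                key=lambda kv: kv[1].q()
                + c_puct * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits),
            )
            stones -= move
            path.append(node)
        # Expansion: add children for a non-terminal leaf.
        if stones > 0:
            for m, p in policy(stones).items():
                node.children[m] = Node(p)
        # Evaluation, from the viewpoint of the player to move at the leaf.
        value = rollout(stones, rng)
        # Backup: flip the sign at each level so every node's statistics are
        # from the viewpoint of the player who moved into it.
        for visited in reversed(path):
            value = -value
            visited.visits += 1
            visited.value_sum += value
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]
```

In this toy game the winning strategy is to leave the opponent a multiple of four stones, so `search(5)` should settle on taking 1 stone and `search(6)` on taking 2. In a setup closer to the one the abstract describes, trained networks would take the place of the `policy` and `rollout` stand-ins.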

Similar articles

[1]
Mastering the game of Go with deep neural networks and tree search.

Nature. 2016-1-28

[2]
Mastering the game of Go without human knowledge.

Nature. 2017-10-18

[3]
Google AI algorithm masters ancient game of Go.

Nature. 2016-1-28

[4]
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play.

Science. 2018-12-7

[5]
Learning to Play the Chess Variant Crazyhouse Above World Champion Level With Deep Neural Networks and Human Data.

Front Artif Intell. 2020-4-28

[6]
Evolutionary swarm neural network game engine for Capture Go.

Neural Netw. 2009-11-20

[7]
Learning to play Go using recursive neural networks.

Neural Netw. 2008-11

[8]
AlphaDDA: strategies for adjusting the playing strength of a fully trained AlphaZero system to a suitable human training partner.

PeerJ Comput Sci. 2022-10-4

[9]
[Deep Learning and AlphaGo].

Brain Nerve. 2019-7

[10]
Human-level control through deep reinforcement learning.

Nature. 2015-2-26

Cited by

[1]
Probing for consciousness in machines.

Front Artif Intell. 2025-8-20

[2]
Photonics and microwaves merge to improve computing flexibility.

Light Sci Appl. 2025-9-4

[3]
Toward the Uniform of Chemical Theory, Simulation, and Experiments in Metaverse Technology.

Precis Chem. 2023-6-14

[4]
Machine learning for estimation and control of quantum systems.

Natl Sci Rev. 2025-7-7

[5]
A framework for robotic manipulation tasks based on multiple zero shot models.

Sci Rep. 2025-8-24

[6]
AlphaFold 3: an unprecedent opportunity for fundamental research and drug development.

Precis Clin Med. 2025-7-1

[7]
Data-driven equation discovery reveals nonlinear reinforcement learning in humans.

Proc Natl Acad Sci U S A. 2025-8-5

[8]
Artificial Intelligence in Thoracic Surgery: Transforming Diagnostics, Treatment, and Patient Outcomes.

Diagnostics (Basel). 2025-7-8

[9]
Integrated biotechnological and AI innovations for crop improvement.

Nature. 2025-7

[10]
Multi-fidelity neural network-based prediction of tensile strength of high-entropy alloy (FeNiCoCrCu) using molecular dynamics data.

J Mol Model. 2025-7-22
