Outracing champion Gran Turismo drivers with deep reinforcement learning.

Affiliations

Sony AI, New York, NY, USA.

Sony AI, Tokyo, Japan.

Publication information

Nature. 2022 Feb;602(7896):223-228. doi: 10.1038/s41586-021-04357-7. Epub 2022 Feb 9.

DOI: 10.1038/s41586-021-04357-7
PMID: 35140384

Abstract

Many potential applications of artificial intelligence involve making real-time decisions in physical systems while interacting with humans. Automobile racing represents an extreme example of these conditions; drivers must execute complex tactical manoeuvres to pass or block opponents while operating their vehicles at their traction limits. Racing simulations, such as the PlayStation game Gran Turismo, faithfully reproduce the non-linear control challenges of real race cars while also encapsulating the complex multi-agent interactions. Here we describe how we trained agents for Gran Turismo that can compete with the world's best e-sports drivers. We combine state-of-the-art, model-free, deep reinforcement learning algorithms with mixed-scenario training to learn an integrated control policy that combines exceptional speed with impressive tactics. In addition, we construct a reward function that enables the agent to be competitive while adhering to racing's important, but under-specified, sportsmanship rules. We demonstrate the capabilities of our agent, Gran Turismo Sophy, by winning a head-to-head competition against four of the world's best Gran Turismo drivers. By describing how we trained championship-level racers, we demonstrate the possibilities and challenges of using these techniques to control complex dynamical systems in domains where agents must respect imprecisely defined human norms.
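
The abstract mentions a reward function that keeps the agent competitive while respecting sportsmanship rules, but gives no formula. The sketch below is only an illustration of that general idea, not the authors' reward: the names `StepInfo`, `RewardWeights`, and `shaped_reward`, the choice of terms, and all weight values are hypothetical assumptions. It combines a progress term (encouraging speed) with fixed penalties for leaving the track and for causing a collision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepInfo:
    """Hypothetical per-step observations used to compute the reward."""
    progress_m: float       # distance gained along the track this step, in meters
    off_course: bool        # True if the car left the track this step
    caused_collision: bool  # True if the agent was at fault in a collision


@dataclass(frozen=True)
class RewardWeights:
    """Illustrative weights; these values are assumptions, not from the paper."""
    progress: float = 1.0
    off_course: float = 5.0
    collision: float = 10.0


def shaped_reward(step: StepInfo, w: RewardWeights = RewardWeights()) -> float:
    """Reward track progress and subtract penalties for unsporting behavior."""
    reward = w.progress * step.progress_m
    if step.off_course:
        reward -= w.off_course
    if step.caused_collision:
        reward -= w.collision
    return reward


if __name__ == "__main__":
    clean = StepInfo(progress_m=2.0, off_course=False, caused_collision=False)
    dirty = StepInfo(progress_m=2.0, off_course=False, caused_collision=True)
    print(shaped_reward(clean), shaped_reward(dirty))
```

In a sketch like this, the relative size of the penalty weights determines how much speed the agent would give up to avoid contact, which is one way to picture the trade-off between competitiveness and sportsmanship that the abstract describes.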
