Faster sorting algorithms discovered using deep reinforcement learning.

Affiliations

DeepMind, London, UK.

Google, Mountain View, CA, USA.

Publication information

Nature. 2023 Jun;618(7964):257-263. doi: 10.1038/s41586-023-06004-9. Epub 2023 Jun 7.

DOI:10.1038/s41586-023-06004-9
PMID:37286649
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10247365/
Abstract

Fundamental algorithms such as sorting or hashing are used trillions of times on any given day. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Whereas remarkable progress has been achieved in the past, making further improvements on the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library. This change to this part of the sort library represents the replacement of a component with an algorithm that has been automatically discovered using reinforcement learning. We also present results in extra domains, showcasing the generality of the approach.
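The abstract describes searching for small, fixed-size sorting routines and judging candidates on how well they perform. As a rough illustration only, the sketch below represents a candidate routine as a list of compare-exchange steps, checks it exhaustively on every permutation of a small input, and scores correct candidates by length. The network shown is the classic 3-input sorting network, not a routine from the paper, and the names `run_network`, `is_correct`, and `score` are hypothetical; the paper's actual program representation, game, and reward are not reproduced here.

```python
from itertools import permutations

# A candidate "program" is a list of compare-exchange steps (i, j) with i < j.
# This is a classic 3-input sorting network, used purely for illustration.
SORT3_NETWORK = [(0, 1), (1, 2), (0, 1)]


def run_network(network, values):
    """Apply each compare-exchange step in order and return the result."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def is_correct(network, n):
    """Check the network on every permutation of n distinct values."""
    return all(
        run_network(network, p) == sorted(p) for p in permutations(range(n))
    )


def score(network, n):
    """Hypothetical score: None for incorrect programs, otherwise the
    negated step count, so shorter correct programs score higher."""
    if not is_correct(network, n):
        return None
    return -len(network)
```

Because the input size is fixed and small, correctness can be verified by brute force, which is what makes a "shorter but still correct" search objective well defined in this toy setting.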

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/0c2ff5430909/41586_2023_6004_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/454f474a84a4/41586_2023_6004_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/2de86ab7cc40/41586_2023_6004_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/162b69a89da2/41586_2023_6004_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/c8058f1d70ca/41586_2023_6004_Fig5_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/75ddcbca10b3/41586_2023_6004_Fig6_ESM.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/063e/10247365/448f668f5386/41586_2023_6004_Fig7_ESM.jpg

Similar articles

1
Faster sorting algorithms discovered using deep reinforcement learning.
Nature. 2023 Jun;618(7964):257-263. doi: 10.1038/s41586-023-06004-9. Epub 2023 Jun 7.
2
Discovering faster matrix multiplication algorithms with reinforcement learning.
Nature. 2022 Oct;610(7930):47-53. doi: 10.1038/s41586-022-05172-4. Epub 2022 Oct 5.
3
Human-level control through deep reinforcement learning.
Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
4
Evolving interpretable plasticity for spiking networks.
Elife. 2021 Oct 28;10:e66273. doi: 10.7554/eLife.66273.
5
Countering a Drone in a 3D Space: Analyzing Deep Reinforcement Learning Methods.
Sensors (Basel). 2022 Nov 16;22(22):8863. doi: 10.3390/s22228863.
6
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play.
Science. 2018 Dec 7;362(6419):1140-1144. doi: 10.1126/science.aar6404.
7
Mastering the game of Go with deep neural networks and tree search.
Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.
8
Curriculum Reinforcement Learning Based on K-Fold Cross Validation.
Entropy (Basel). 2022 Dec 6;24(12):1787. doi: 10.3390/e24121787.
9
Investigation of independent reinforcement learning algorithms in multi-agent environments.
Front Artif Intell. 2022 Sep 20;5:805823. doi: 10.3389/frai.2022.805823. eCollection 2022.
10
Intelligent sort-timing prediction for image-activated cell sorting.
Cytometry A. 2023 Jan;103(1):88-97. doi: 10.1002/cyto.a.24664. Epub 2022 Jun 29.

Cited by

1
The STARD-AI reporting guideline for diagnostic accuracy studies using artificial intelligence.
Nat Med. 2025 Sep 15. doi: 10.1038/s41591-025-03953-8.
2
Enterprise fission path optimization and dynamic capability construction based on the soft actor-critic algorithm.
Sci Rep. 2025 Jul 1;15(1):20942. doi: 10.1038/s41598-025-06180-w.
3
Machine Learning Discovers Numerous New Computational Principles Supporting Elementary Motion Detection.
bioRxiv. 2025 May 29:2025.05.26.656164. doi: 10.1101/2025.05.26.656164.
4
Generative artificial intelligence: a historical perspective.
Natl Sci Rev. 2025 Feb 21;12(5):nwaf050. doi: 10.1093/nsr/nwaf050. eCollection 2025 May.
5
Optimizing generative AI by backpropagating language model feedback.
Nature. 2025 Mar;639(8055):609-616. doi: 10.1038/s41586-025-08661-4. Epub 2025 Mar 19.
6
ECG data analysis to determine ST-segment elevation myocardial infarction and infarction territory type: an integrative approach of artificial intelligence and clinical guidelines.
Front Physiol. 2024 Oct 7;15:1462847. doi: 10.3389/fphys.2024.1462847. eCollection 2024.
7
Mathematical discoveries from program search with large language models.
Nature. 2024 Jan;625(7995):468-475. doi: 10.1038/s41586-023-06924-6. Epub 2023 Dec 14.
8
Harnessing deep learning for population genetic inference.
Nat Rev Genet. 2024 Jan;25(1):61-78. doi: 10.1038/s41576-023-00636-3. Epub 2023 Sep 4.
9
DeepMind AI creates algorithms that sort data faster than those built by people.
Nature. 2023 Jun;618(7965):443-444. doi: 10.1038/d41586-023-01883-4.