DeepMind, London, UK.
Google, Mountain View, CA, USA.
Nature. 2023 Jun;618(7964):257-263. doi: 10.1038/s41586-023-06004-9. Epub 2023 Jun 7.
Fundamental algorithms such as sorting and hashing are used trillions of times on any given day. As demand for computation grows, it has become critical for these algorithms to be as performant as possible. Although remarkable progress has been achieved in the past, further improving the efficiency of these routines has proved challenging for both human scientists and computational approaches. Here we show how artificial intelligence can go beyond the current state of the art by discovering hitherto unknown routines. To realize this, we formulated the task of finding a better sorting routine as a single-player game. We then trained a new deep reinforcement learning agent, AlphaDev, to play this game. AlphaDev discovered small sorting algorithms from scratch that outperformed previously known human benchmarks. These algorithms have been integrated into the LLVM standard C++ sort library. This change replaces a component of the sort library with an algorithm that was automatically discovered using reinforcement learning. We also present results in additional domains, showcasing the generality of the approach.
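As a rough illustration of what "finding a sorting routine as a single-player game" could look like, the following toy sketch treats a program as a sequence of compare-exchange steps on a fixed-size input, where each move appends one step and the reward is given only for programs that sort every input. Everything here is an assumption made for illustration: the action set, the reward function, the names (`ACTIONS`, `run`, `is_correct`, `reward`, `shortest_correct_program`), and the exhaustive search that stands in for the learned agent. It is not AlphaDev's instruction set, reward, or training procedure.

```python
from itertools import permutations, product

N = 3  # toy input size (hypothetical; chosen so exhaustive search is instant)

# One "move" in the toy game: compare positions i < j and swap if out of order.
ACTIONS = [(i, j) for i in range(N) for j in range(i + 1, N)]


def run(program, values):
    """Apply each compare-exchange (i, j) in order and return the result."""
    out = list(values)
    for i, j in program:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def is_correct(program):
    """True if the program sorts every permutation of N distinct values."""
    return all(run(program, p) == sorted(p) for p in permutations(range(N)))


def reward(program):
    """Toy terminal reward: 0 if incorrect, otherwise larger for shorter programs."""
    return 1.0 / len(program) if program and is_correct(program) else 0.0


def shortest_correct_program(max_len=4):
    """Exhaustive stand-in for an agent: try all programs by increasing length."""
    for length in range(1, max_len + 1):
        for program in product(ACTIONS, repeat=length):
            if is_correct(program):
                return list(program)
    return None


if __name__ == "__main__":
    best = shortest_correct_program()
    print(best, reward(best))
```

Exhaustive search is only feasible because the toy action set and input size are tiny; the search space grows exponentially with program length, which is the kind of difficulty a learned search policy is meant to address.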