

Acceleration of Graph Neural Network-Based Prediction Models in Chemistry via Co-Design Optimization on Intelligence Processing Units.

Affiliations

Graphcore, Kett House, Station Rd, Cambridge CB1 2JH, U.K.

Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory, 1100 Dexter Ave N, Seattle, Washington 98109, United States.

Publication Information

J Chem Inf Model. 2024 Mar 11;64(5):1568-1580. doi: 10.1021/acs.jcim.3c01312. Epub 2024 Feb 21.

Abstract

Atomic structure prediction and associated property calculations are the bedrock of chemical physics. Because high-fidelity ab initio modeling techniques for computing structure and properties can be prohibitively expensive, there is strong motivation to develop machine-learning (ML) models that make these predictions more efficiently. Training graph neural networks over large atomistic databases introduces unique computational challenges, such as the need to process millions of small graphs of variable size and to support communication patterns distinct from those that arise when learning over large graphs such as social networks. We demonstrate a novel hardware-software codesign approach to scale up the training of atomistic graph neural networks (GNNs) for structure and property prediction. First, to eliminate the redundant computation and memory associated with alternative padding techniques and to improve throughput by minimizing communication, we formulate the effective coalescing of batches of variable-size atomistic graphs as a bin packing problem and introduce a hardware-agnostic algorithm to pack these batches. In addition, we propose hardware-specific optimizations, including a planner and vectorization for the gather-scatter operations targeted at Graphcore's Intelligence Processing Unit (IPU), as well as model-specific optimizations such as merged communication collectives and an optimized softplus. Putting these all together, we demonstrate the effectiveness of the proposed codesign approach by providing an implementation of a well-established atomistic GNN on Graphcore IPUs. We evaluate the training performance on multiple atomistic graph databases with varying graph counts, sizes, and sparsity. We demonstrate that such a codesign approach can reduce the training time of atomistic GNNs and can improve their performance by up to 1.5× compared to the baseline implementation of the model on the IPUs. Additionally, we compare our IPU implementation with an NVIDIA GPU-based implementation and show that our atomistic GNN implementation on the IPUs can run 1.8× faster on average compared to the execution time on the GPUs.
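The batch-coalescing idea described in the abstract can be pictured with a simple packing heuristic: graphs are grouped so that their combined node counts fill a fixed bin capacity, and only the residual slack in each bin needs padding. The sketch below is a minimal illustration using first-fit decreasing; it is not the paper's hardware-agnostic packing algorithm, and the `pack_graphs` helper, the 64-node capacity, and the example molecule sizes are assumptions made for demonstration only.

```python
# Illustrative sketch: pack variable-size atomistic graphs into fixed-capacity
# bins so padding waste is small. First-fit decreasing is a common bin packing
# heuristic; the paper's actual packing algorithm is not reproduced here.

from typing import List

def pack_graphs(num_nodes_per_graph: List[int], capacity: int) -> List[List[int]]:
    """Group graph indices into bins whose total node count stays <= capacity."""
    # Visit graphs largest-first (first-fit decreasing).
    order = sorted(range(len(num_nodes_per_graph)),
                   key=lambda i: num_nodes_per_graph[i], reverse=True)
    bins: List[List[int]] = []   # graph indices assigned to each bin
    remaining: List[int] = []    # free node slots left in each bin
    for idx in order:
        size = num_nodes_per_graph[idx]
        if size > capacity:
            raise ValueError(f"graph {idx} ({size} nodes) exceeds bin capacity")
        # Place the graph in the first bin with enough free slots.
        for b, free in enumerate(remaining):
            if size <= free:
                bins[b].append(idx)
                remaining[b] -= size
                break
        else:
            # No existing bin fits; open a new one.
            bins.append([idx])
            remaining.append(capacity - size)
    return bins

if __name__ == "__main__":
    # Hypothetical node counts of ten small molecules, packed into 64-node bins.
    sizes = [21, 9, 30, 17, 12, 45, 8, 26, 19, 11]
    packed = pack_graphs(sizes, capacity=64)
    for b, graph_ids in enumerate(packed):
        total = sum(sizes[i] for i in graph_ids)
        print(f"bin {b}: graphs {graph_ids}, {total}/64 nodes used")
```

Under this assumption, each bin is padded only up to the shared capacity rather than every graph being padded to the largest graph in the dataset, which is the redundancy the abstract's bin packing formulation is meant to remove.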

