High-Performance, Graphics Processing Unit-Accelerated Fock Build Algorithm.

Author information

Barca Giuseppe M J, Galvez-Vallejo Jorge L, Poole David L, Rendell Alistair P, Gordon Mark S

Affiliations

Research School of Computer Science, Australian National University, Canberra, Australian Capital Territory 2601, Australia.

Department of Chemistry and Ames Laboratory, Iowa State University, Ames, Iowa 50011, United States.

Publication information

J Chem Theory Comput. 2020 Dec 8;16(12):7232-7238. doi: 10.1021/acs.jctc.0c00768. Epub 2020 Nov 18.

Abstract

We present a high-performance, GPU (graphics processing unit)-accelerated algorithm for building the Fock matrix. The algorithm is designed for efficient calculations on large molecular systems and uses a novel dynamic load-balancing scheme that maximizes GPU throughput and avoids the thread divergence that integral screening could otherwise cause. The code also adopts a novel ERI (electron repulsion integral) digestion algorithm that exploits all forms of permutational symmetry, efficiently combines the evaluation of the Coulomb and exchange terms, and eliminates the need for explicit thread synchronization. Performance results for a number of large molecules show speedups of up to 24.4× over the QUICK GPU code and up to 237× over the GAMESS parallel CPU code.
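
To make "ERI digestion with permutational symmetry" concrete, the following is a minimal serial sketch in Python. It is not the authors' GPU algorithm, and it does not model load balancing, screening, or thread synchronization. It assumes real basis functions (so each integral has 8-fold permutational symmetry) and the closed-shell convention G = J - K/2 with a density matrix that already carries the factor of 2. All function names are hypothetical, and the integrals are random placeholders rather than real ERIs.

```python
import random


def canonical(i, j, k, l):
    """Canonical form of (ij|kl): i >= j, k >= l, and (i, j) >= (k, l)."""
    if i < j:
        i, j = j, i
    if k < l:
        k, l = l, k
    if (i, j) < (k, l):
        i, j, k, l = k, l, i, j
    return (i, j, k, l)


def orbit(i, j, k, l):
    """Distinct index tuples equivalent to (ij|kl) under 8-fold symmetry.

    A set is used so that coinciding indices are not counted twice.
    """
    return {
        (i, j, k, l), (j, i, k, l), (i, j, l, k), (j, i, l, k),
        (k, l, i, j), (l, k, i, j), (k, l, j, i), (l, k, j, i),
    }


def random_symmetric_eri(n, seed=0):
    """Placeholder integrals (assumption): one random value per unique quartet."""
    rng = random.Random(seed)
    eri = {}
    for i in range(n):
        for j in range(i + 1):
            for k in range(i + 1):
                for l in range(k + 1):
                    if (k, l) <= (i, j):
                        eri[(i, j, k, l)] = rng.uniform(-1.0, 1.0)
    return eri


def random_symmetric_density(n, seed=0):
    """Placeholder symmetric density matrix (assumption, not a converged SCF density)."""
    rng = random.Random(seed)
    D = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1):
            D[a][b] = D[b][a] = rng.uniform(-1.0, 1.0)
    return D


def build_g(n, eri, D):
    """Digest each unique ERI once into G = J - K/2.

    Coulomb and exchange updates are applied together for every
    symmetry-equivalent index tuple of the unique integral.
    """
    G = [[0.0] * n for _ in range(n)]
    for (i, j, k, l), v in eri.items():
        for a, b, c, d in orbit(i, j, k, l):
            G[a][b] += D[c][d] * v          # Coulomb:  J_ab += D_cd (ab|cd)
            G[a][c] -= 0.5 * D[b][d] * v    # exchange: K_ac += D_bd (ab|cd)
    return G


def reference_g(n, eri, D):
    """Brute-force reference: loop over all n**4 index tuples."""
    G = [[0.0] * n for _ in range(n)]
    r = range(n)
    for a in r:
        for b in r:
            for c in r:
                for d in r:
                    v = eri[canonical(a, b, c, d)]
                    G[a][b] += D[c][d] * v
                    G[a][c] -= 0.5 * D[b][d] * v
    return G
```

Calling `build_g(n, eri, D)` with the placeholder inputs should reproduce `reference_g(n, eri, D)` to within rounding error while visiting only the unique integrals, roughly one eighth of the n⁴ tuples for large n. In a parallel setting the scattered updates to `G` are where write conflicts arise, which is the kind of synchronization cost the abstract says the authors' digestion scheme eliminates.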
