Generating Efficient Quantum Chemistry Codes for Novel Architectures.

Authors

Titov Alexey V, Ufimtsev Ivan S, Luehr Nathan, Martinez Todd J

Affiliations

National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States.

Department of Chemistry and the PULSE Institute, Stanford University, Stanford, California 94305, United States.

Publication information

J Chem Theory Comput. 2013 Jan 8;9(1):213-21. doi: 10.1021/ct300321a. Epub 2012 Nov 12.

Abstract

We describe an extension of our graphics processing unit (GPU) electronic structure program TeraChem to include atom-centered Gaussian basis sets with d angular momentum functions. This was made possible by a "meta-programming" strategy that leverages computer algebra systems for the derivation of equations and their transformation to correct code. We generate a multitude of code fragments that are formally mathematically equivalent, but differ in their memory and floating-point operation footprints. We then select between different code fragments using empirical testing to find the highest performing code variant. This leads to an optimal balance of floating-point operations and memory bandwidth for a given target architecture without laborious manual tuning. We show that this approach is capable of similar performance compared to our hand-tuned GPU kernels for basis sets with s and p angular momenta. We also demonstrate that mixed precision schemes (using both single and double precision) remain stable and accurate for molecules with d functions. We provide benchmarks of the execution time of entire self-consistent field (SCF) calculations using our GPU code and compare to mature CPU based codes, showing the benefits of the GPU architecture for electronic structure theory with appropriately redesigned algorithms. We suggest that the meta-programming and empirical performance optimization approach may be important in future computational chemistry applications, especially in the face of quickly evolving computer architectures.
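
The generate-then-select idea in the abstract can be illustrated with a toy sketch. This is not TeraChem's generator, which relies on computer algebra systems and targets GPU integral kernels; it is a minimal stand-in that assumes a simple polynomial as the "equation". The helper names (`generate_variants`, `compile_variant`, `select_fastest`) and the two variants are hypothetical. The sketch emits two mathematically equivalent source fragments with different operation counts, checks that they agree numerically, and picks the faster one by timing on the machine at hand.

```python
import timeit


def generate_variants(coeffs):
    """Emit equivalent source fragments for sum(c_k * x**k).

    Hypothetical stand-in for a code generator: both fragments compute the
    same polynomial but differ in how many multiplications they perform.
    """
    # Variant 1: explicit powers (more floating-point work per term).
    terms = [f"{c!r} * x**{k}" for k, c in enumerate(coeffs)]
    naive = "def kernel(x):\n    return " + " + ".join(terms) + "\n"

    # Variant 2: Horner form (one multiply and one add per coefficient).
    expr = repr(coeffs[-1])
    for c in reversed(coeffs[:-1]):
        expr = f"({expr}) * x + {c!r}"
    horner = "def kernel(x):\n    return " + expr + "\n"

    return {"naive_powers": naive, "horner": horner}


def compile_variant(source):
    """Turn a generated source fragment into a callable."""
    namespace = {}
    exec(source, namespace)
    return namespace["kernel"]


def select_fastest(variants, sample_inputs, repeats=5, number=2000):
    """Verify that all variants agree, then time each and return the fastest."""
    kernels = {name: compile_variant(src) for name, src in variants.items()}

    # Correctness first: every variant must match the first one on all inputs.
    reference = next(iter(kernels.values()))
    for kernel in kernels.values():
        for x in sample_inputs:
            expected = reference(x)
            assert abs(kernel(x) - expected) <= 1e-9 * max(1.0, abs(expected))

    # Empirical selection: keep the best of several timing runs per variant.
    timings = {}
    for name, kernel in kernels.items():
        timer = timeit.Timer(lambda k=kernel: [k(x) for x in sample_inputs])
        timings[name] = min(timer.repeat(repeat=repeats, number=number))

    best = min(timings, key=timings.get)
    return best, timings


if __name__ == "__main__":
    variants = generate_variants([1.0, -2.0, 0.5, 3.0])
    best, timings = select_fastest(variants, [0.1, 0.5, 2.0])
    for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
        print(f"{name}: {seconds:.6f} s")
    print("selected:", best)
```

Which variant wins depends on the interpreter and hardware, which is the point of selecting empirically rather than by operation count alone.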
