

Double-buffered, heterogeneous CPU + GPU integral digestion algorithm for single-excitation calculations involving a large number of excited states.

Author information

Morrison Adrian F, Epifanovsky Evgeny, Herbert John M

Affiliations

Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio.

Q-Chem Inc., Pleasanton, California.

Publication information

J Comput Chem. 2018 Oct 5;39(26):2173-2182. doi: 10.1002/jcc.25531. Epub 2018 Oct 3.

Abstract

The most widely used quantum-chemical models for excited states are single-excitation theories, a category that includes configuration interaction with single substitutions, time-dependent density functional theory, and also a recently developed ab initio exciton model. When a large number of excited states are desired, these calculations incur a significant bottleneck in the "digestion" step in which two-electron integrals are contracted with density or density-like matrices. We present an implementation that moves this step onto graphical processing units (GPUs), and introduce a double-buffer scheme that minimizes latency by computing integrals on the central processing units (CPUs) concurrently with their digestion on the GPUs. An automatic code generation scheme simplifies the implementation of high-performance GPU kernels. For the exciton model, which requires separate excited-state calculations on each electronically coupled chromophore, the heterogeneous implementation described here results in speedups of 2-6× versus a CPU-only implementation. For traditional time-dependent density functional theory calculations, we obtain speedups of up to 5× when a large number of excited states is computed. © 2018 Wiley Periodicals, Inc.
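To make the double-buffering idea concrete, the following is a minimal, hypothetical CUDA sketch, not the authors' implementation or Q-Chem code: two pinned host buffers alternate roles so that the CPU fills one buffer with an integral batch while an asynchronous copy and a stand-in "digestion" kernel process the other buffer on its own stream. All names (digest_batch, compute_integrals_on_cpu), buffer sizes, and the element-wise contraction are illustrative assumptions; the paper's actual kernels are generated automatically and perform genuine integral contractions.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

constexpr int BATCH = 1 << 20;   // integrals per batch (arbitrary size)
constexpr int NMAT  = 1024;      // flattened size of a density-like matrix

// Hypothetical stand-in for a generated "digestion" kernel: contracts one
// batch of two-electron integrals with a density-like matrix, element-wise.
__global__ void digest_batch(const double* integrals, const double* density,
                             double* out, int n, int nmat) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = integrals[i] * density[i % nmat];
}

// Stand-in for the CPU integral code: fills one pinned host buffer.
static void compute_integrals_on_cpu(double* buf, int n, int batch) {
    for (int i = 0; i < n; ++i) buf[i] = 1e-3 * ((i + batch) % 97);
}

int main() {
    double *host_buf[2], *dev_buf[2], *dev_out, *dev_density;
    cudaStream_t stream[2];

    for (int b = 0; b < 2; ++b) {
        cudaMallocHost(&host_buf[b], BATCH * sizeof(double));  // pinned memory
        cudaMalloc(&dev_buf[b], BATCH * sizeof(double));
        cudaStreamCreate(&stream[b]);
    }
    cudaMalloc(&dev_out, BATCH * sizeof(double));
    cudaMalloc(&dev_density, NMAT * sizeof(double));
    cudaMemset(dev_density, 0, NMAT * sizeof(double));  // placeholder density

    const int n_batches = 8;
    for (int k = 0; k < n_batches; ++k) {
        int b = k % 2;
        // Wait until this buffer's previous copy and kernel have finished.
        cudaStreamSynchronize(stream[b]);
        // The CPU produces the next integral batch while the *other* stream's
        // copy and digestion kernel may still be running on the GPU.
        compute_integrals_on_cpu(host_buf[b], BATCH, k);
        cudaMemcpyAsync(dev_buf[b], host_buf[b], BATCH * sizeof(double),
                        cudaMemcpyHostToDevice, stream[b]);
        digest_batch<<<(BATCH + 255) / 256, 256, 0, stream[b]>>>(
            dev_buf[b], dev_density, dev_out, BATCH, NMAT);
    }
    cudaDeviceSynchronize();
    printf("processed %d batches\n", n_batches);

    for (int b = 0; b < 2; ++b) {
        cudaFreeHost(host_buf[b]);
        cudaFree(dev_buf[b]);
        cudaStreamDestroy(stream[b]);
    }
    cudaFree(dev_out);
    cudaFree(dev_density);
    return 0;
}
```

The point of the two-buffer rotation is the latency hiding described in the abstract: as long as one CPU integral batch takes roughly as long as one GPU copy-plus-digestion pass, neither processor sits idle.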

