Kim Inkoo, Jeong Daun, Weisburn Leah P, Alexiu Alexandra, Van Voorhis Troy, Rhee Young Min, Son Won-Joon, Kim Hyung-Jin, Yim Jinkyu, Kim Sungmin, Cho Yeonchoo, Jang Inkook, Lee Seungmin, Kim Dae Sin
Innovation Center, Samsung Electronics, Hwaseong 18448, Republic of Korea.
Department of Chemistry, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, United States.
J Chem Theory Comput. 2024 Oct 22;20(20):9018-9031. doi: 10.1021/acs.jctc.4c01003. Epub 2024 Oct 7.
Modern graphics processing units (GPUs) provide an unprecedented level of computing power. In this study, we present a high-performance, multi-GPU implementation of the analytical nuclear gradient for Kohn-Sham time-dependent density functional theory (TDDFT), employing the Tamm-Dancoff approximation (TDA) and Gaussian-type atomic orbitals as basis functions. We discuss GPU-efficient algorithms for the derivatives of electron repulsion integrals and exchange-correlation functionals within the range-separated scheme. As an illustrative example, we calculate the TDA-TDDFT gradient of the S1 state of a full-scale green fluorescent protein with explicit water solvent molecules, totaling 4353 atoms, at the ωB97X/def2-SVP level of theory. Our algorithm demonstrates favorable parallel efficiencies on a high-speed distributed system equipped with 256 Nvidia A100 GPUs, achieving >70% with up to 64 GPUs and 31% with 256 GPUs, effectively leveraging the capabilities of modern high-performance computing systems.
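To put the quoted parallel efficiencies in perspective, the sketch below converts them into implied speedups. It assumes the common strong-scaling definition, efficiency = speedup / (N / N_ref), with a single-GPU reference (N_ref = 1); the abstract does not state the baseline used, so both the definition and the reference size are assumptions, and the function name `implied_speedup` is purely illustrative.

```python
def implied_speedup(efficiency: float, n_gpus: int, n_ref: int = 1) -> float:
    """Speedup over an n_ref-GPU run implied by a parallel efficiency.

    Assumes efficiency = speedup / (n_gpus / n_ref); the baseline n_ref
    is an assumption, since the abstract does not specify it.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    if n_gpus < n_ref or n_ref < 1:
        raise ValueError("need n_gpus >= n_ref >= 1")
    return efficiency * n_gpus / n_ref


# Efficiencies quoted in the abstract: >70% up to 64 GPUs, 31% at 256 GPUs.
lower_bound_64 = implied_speedup(0.70, 64)   # lower bound, since efficiency is >70%
speedup_256 = implied_speedup(0.31, 256)

print(f"64 GPUs:  speedup of at least {lower_bound_64:.1f}x")
print(f"256 GPUs: speedup of about {speedup_256:.1f}x")
```

Under these assumptions, going from 64 to 256 GPUs still shortens the wall time, but by much less than the fourfold increase in hardware.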