Shi Tianyi, Wang Zhenling, Aldossary Abdulrahman, Liu Yang, Li Xiaoye S, Head-Gordon Martin
Applied Mathematics and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States.
Department of Chemistry, University of California, Berkeley, California 94720, United States.
J Chem Theory Comput. 2024 Sep 2. doi: 10.1021/acs.jctc.4c01016.
To alleviate the computational burden of electron correlation methods, whose cost scales superlinearly with molecular size, researchers have developed local correlation methods that treat relatively small contributions as zero while still yielding accurate energies. Such local correlation techniques can also be combined with parallel computing resources for further efficiency and scalability. This work focuses on the distributed-memory parallel implementation of a local correlation method for second-order Møller-Plesset (MP2) theory. The method uses a single threshold to control both the dropping of terms and the accuracy of the different computing kernels in the algorithm. The process partitioning strategy and the distributed parallel implementation with the Message Passing Interface (MPI) are discussed. In particular, the algorithm relies on a fixed-sparsity-pattern matrix multiplication and a corresponding distributed conjugate gradient solver, which exhibit almost linear scaling in both strong and weak scaling analyses. Numerical experiments on a range of molecules, including linear chains and molecules with two- and three-dimensional character, are reported. For example, with only 32 MPI ranks, this MP2 implementation can calculate the correlation energy of vancomycin in the def2-TZVP basis to within 0.003% accuracy (10 threshold) in half an hour, whereas the same problem is infeasible to solve with sequential or pure shared-memory implementations.
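The two kernels named in the abstract, a matrix multiplication restricted to a fixed sparsity pattern and a conjugate gradient (CG) solver built on it, can be illustrated with a minimal serial sketch. This is not the authors' implementation: the function names, the banded pattern, and the small test matrix are all assumptions made for illustration, and the MPI distribution is omitted. In a distributed version, each rank would hold a subset of the retained entries, and the inner products inside CG would be summed across ranks.

```python
import numpy as np

def fixed_pattern_matmul(A, B, pattern):
    """Compute A @ B only at positions where `pattern` is True (hypothetical helper)."""
    rows, cols = np.nonzero(pattern)
    C = np.zeros((A.shape[0], B.shape[1]))
    # One dot product (row of A with column of B) per retained entry.
    C[rows, cols] = np.einsum("ik,ki->i", A[rows, :], B[:, cols])
    return C

def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=500):
    """Plain CG for a symmetric positive definite operator `apply_A`."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rs = np.vdot(r, r)  # in a distributed code this would be a global reduction
    for _ in range(max_iter):
        if np.sqrt(rs) <= tol:
            break
        Ap = apply_A(p)
        alpha = rs / np.vdot(p, Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = np.vdot(r, r)
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Assumed toy problem: a small SPD tridiagonal matrix and a banded pattern.
n = 6
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
idx = np.arange(n)
pattern = np.abs(idx[:, None] - idx[None, :]) <= 1  # fixed sparsity pattern
B = np.where(pattern, 1.0, 0.0)                     # right-hand side on the pattern

# Solve pattern * (A @ X) = B for X supported on the same pattern.
X = conjugate_gradient(lambda M: fixed_pattern_matmul(A, M, pattern), B)
```

Because every product is projected back onto the same pattern, the iterates never fill in outside it, so storage and work per iteration stay proportional to the number of retained entries.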