Department of Chemistry, KAIST, 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea.
J Chem Theory Comput. 2023 Mar 14;19(5):1457-1465. doi: 10.1021/acs.jctc.2c00983. Epub 2023 Feb 22.
Single precision (SP) arithmetic can be much faster than double precision (DP) arithmetic on graphics processing units (GPUs). However, using SP throughout an electronic structure calculation does not deliver the required accuracy. We propose a three-fold dynamic precision approach that accelerates calculations while retaining the accuracy of DP. In this approach, SP, DP, and mixed precision are switched dynamically during an iterative diagonalization process. We applied it to the locally optimal block preconditioned conjugate gradient method to accelerate a large-scale eigenvalue solver for the Kohn-Sham equation. We determined a suitable threshold for switching between the precision schemes by examining the convergence pattern of the eigenvalue solver using only the kinetic energy operator of the Kohn-Sham Hamiltonian. As a result, we achieved speedups of up to 8.53× for band structure calculations and up to 6.60× for self-consistent field calculations, for test systems under various boundary conditions on NVIDIA GPUs.
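The idea of switching precision during an iterative diagonalization can be illustrated with a minimal sketch. It is not the authors' implementation: it runs on the CPU with NumPy, so it shows no speedup, and it uses an unpreconditioned block steepest descent with a Rayleigh-Ritz step rather than the locally optimal block preconditioned conjugate gradient method. The function names, the residual-based switching rule, the thresholds `sp_tol` and `mp_tol`, the test matrix, and the reading of "mixed precision" as "operator applied in SP, everything else in DP" are all assumptions made for illustration.

```python
import numpy as np


def choose_precision(residual, sp_tol=1e-2, mp_tol=1e-4):
    """Hypothetical switching rule; the thresholds are illustrative only."""
    if residual > sp_tol:
        return "SP"   # far from convergence: everything in float32
    if residual > mp_tol:
        return "MP"   # assumed meaning: float32 operator, float64 elsewhere
    return "DP"       # near convergence: everything in float64


def solve_lowest(H, k, tol=1e-8, max_iter=500, seed=0):
    """Block steepest descent with Rayleigh-Ritz for the k lowest eigenpairs."""
    rng = np.random.default_rng(seed)
    H32 = H.astype(np.float32)
    X, _ = np.linalg.qr(rng.standard_normal((H.shape[0], k)))
    residual, history = np.inf, []
    for _ in range(max_iter):
        mode = choose_precision(residual)
        history.append(mode)
        op = H if mode == "DP" else H32                   # operator precision
        work = np.float32 if mode == "SP" else np.float64  # working precision
        Xw = X.astype(work)
        HX = (op @ X.astype(op.dtype)).astype(work)
        theta = np.sum(Xw * HX, axis=0)   # Rayleigh quotients (columns ~orthonormal)
        R = HX - Xw * theta               # residual block
        residual = float(np.linalg.norm(R.astype(np.float64), axis=0).max())
        if mode == "DP" and residual < tol:
            break
        # Rayleigh-Ritz on the subspace spanned by the current block and its residuals.
        S, _ = np.linalg.qr(np.hstack([Xw, R]))
        G = S.T @ (op @ S.astype(op.dtype)).astype(work)
        _, V = np.linalg.eigh((G + G.T) / 2)
        X = (S @ V[:, :k]).astype(np.float64)
    return theta, residual, history


def make_test_matrix(n=40, seed=1):
    """Hypothetical symmetric test matrix with a clear gap above the 3 lowest levels."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    diag = np.concatenate([[1.0, 2.0, 3.0], np.linspace(10.0, 20.0, n - 3)])
    return np.diag(diag) + 0.01 * (A + A.T)


H = make_test_matrix()
theta, residual, history = solve_lowest(H, k=3)
print({m: history.count(m) for m in ("SP", "MP", "DP")})
```

The early iterations run in SP, the run passes through the mixed scheme, and only the final iterations use DP, so the converged eigenvalues are checked against a DP residual. In a real GPU code the thresholds would have to be calibrated for the operator at hand, which the abstract describes doing with the kinetic energy operator.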