Nebot-Gil Ignacio
Institute of Molecular Science, University of Valencia, c/Catedrático José Beltrán 2, E-46980 Paterna (Valencia), Spain.
J Chem Theory Comput. 2015 Feb 10;11(2):472-83. doi: 10.1021/ct500689u.
On the basis of a dressed matrices formalism, a new algorithm has been devised for obtaining the lowest eigenvalue and the corresponding eigenvector of large real symmetric matrices. Given an N × N matrix, the proposed algorithm consists of the diagonalization of (N - 1) dressed matrices of size 2 × 2. Both sequential and parallel versions of the proposed algorithm have been implemented. Tests have been performed on a Hilbert matrix, and the results show that this algorithm is up to 340 times faster than the corresponding LAPACK routine for N = 10^4 and about 10% faster than the Davidson method. The parallel MPI version has been tested using up to 512 nodes. The speed-up for an N = 10^6 matrix is fairly linear up to 64 cores. The time necessary to obtain the lowest eigenvalue and eigenvector is nearly 5.5 min with 512 cores. For an N = 10^7 matrix, the speed-up is nearly linear up to 256 cores and the calculation time is 5.2 h with 512 nodes. Finally, in order to test the new algorithm on MRCI matrices, we have calculated the ground state and the π → π* excited state of the butadiene molecule, starting from both SCF and CASSCF wave functions. In all the cases considered, the correlation energies and wave functions are the same as those obtained with the Davidson algorithm.
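The idea of reaching the lowest eigenpair through a sequence of 2 × 2 diagonalizations can be illustrated with a short sketch. The abstract does not give the dressing formulas, so the code below is not the dressed-matrix algorithm itself. It is a generic coordinate relaxation of the Rayleigh quotient, swapped in for illustration, in which each sweep also solves N - 1 two-by-two problems. The function name, the starting vector (the first basis vector), and the convergence settings are all assumptions made for this example.

```python
import numpy as np


def lowest_eigenpair_by_2x2_sweeps(A, sweeps=50, tol=1e-12):
    """Illustrative coordinate relaxation; NOT the paper's dressed-matrix scheme.

    Assumes A is real symmetric and that the lowest eigenvector has a
    non-zero component along the first basis vector.
    """
    n = A.shape[0]
    x = np.zeros(n)
    x[0] = 1.0            # assumed starting guess: first basis vector
    y = A[:, 0].copy()    # y = A @ x, updated incrementally
    energy = A[0, 0]      # Rayleigh quotient of the unit-norm vector x

    for _ in range(sweeps):
        previous = energy
        for i in range(1, n):          # N - 1 two-by-two problems per sweep
            xi = x[i]
            col = A[:, i]
            norm2 = 1.0 - xi * xi      # squared norm of x with component i removed
            if norm2 <= 1e-300:
                continue
            norm = np.sqrt(norm2)
            # 2 x 2 matrix of A in the orthonormal basis {x'/|x'|, e_i},
            # where x' is x with its i-th component set to zero.
            h11 = (energy - 2.0 * xi * y[i] + xi * xi * A[i, i]) / norm2
            h12 = (y[i] - xi * A[i, i]) / norm
            h22 = A[i, i]
            w, v = np.linalg.eigh(np.array([[h11, h12], [h12, h22]]))
            a, b = v[:, 0]             # eigenvector of the lower 2 x 2 root
            # New vector: a * x'/|x'| + b * e_i (unit norm by construction).
            y = (a / norm) * (y - xi * col) + b * col
            x = (a / norm) * x
            x[i] = b
            energy = w[0]
        if abs(previous - energy) < tol:
            break
    return energy, x
```

Each inner step costs O(N) because only one column of the matrix is touched, so a full sweep costs O(N^2), the same order as one matrix-vector product. A sketch like this converges quickly only for diagonally dominant matrices, which is the typical structure of CI matrices; it should not be expected to reproduce the timings reported above.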