Schmitz Gunnar, Klinting Emil Lund, Christiansen Ove
Department of Chemistry, Aarhus Universitet, DK-8000 Aarhus, Denmark.
J Chem Phys. 2020 Aug 14;153(6):064105. doi: 10.1063/5.0015344.
We present a new iterative scheme for potential energy surface (PES) construction, which relies on both physical information and information obtained through statistical analysis. The adaptive density guided approach (ADGA) is combined with a machine learning technique, namely Gaussian process regression (GPR), to obtain the iterative GPR-ADGA for PES construction. The ADGA provides an average density of vibrational states as a physically motivated importance weighting, together with an algorithm that uses this information to choose points for electronic structure computations. The GPR provides an approximation to the full PES given a set of data points, while the statistical variance associated with the GPR predictions is used to select the most important among the points suggested by the ADGA. The combination of these two methods, resulting in the GPR-ADGA, can thereby iteratively determine the PES. Our implementation additionally allows derivative information to be incorporated in the GPR. The iterative process commences from an initial Hessian and does not require any presampling of configurations prior to the PES construction. We assess the performance on a test set of nine small molecules, using fundamental frequencies computed at the full vibrational configuration interaction level. The GPR-ADGA, with appropriate settings, is shown to provide fundamental excitation frequencies with a root mean square deviation (RMSD) below 2 cm⁻¹ when compared to those obtained from a PES constructed with the standard ADGA. This can be achieved with substantial savings of 65%-90% in the number of single point calculations.
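The selection step described in the abstract (GPR variance used to rank the points proposed by the ADGA) can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the squared-exponential kernel, the unit prior variance, the product `density * variance` used as the ranking score, and all function names (`rbf_kernel`, `gpr_predict`, `select_next_point`) are assumptions made purely for illustration, and the Gaussian `density` is a stand-in for the ADGA's average vibrational density.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1D point sets (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gpr_predict(x_train, y_train, x_query, length_scale=1.0, noise=1e-8):
    """Return the GPR posterior mean and variance at x_query (unit prior variance)."""
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_query, x_train, length_scale)
    L = np.linalg.cholesky(K)
    # Solve K alpha = y via the Cholesky factor.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - np.sum(v ** 2, axis=0)
    return mean, np.maximum(var, 0.0)  # clip tiny negative round-off

def select_next_point(x_train, y_train, candidates, density, length_scale=1.0):
    """Pick the candidate with the largest density-weighted predictive variance.

    The product density * variance is a hypothetical scoring rule,
    not the criterion used in the paper.
    """
    _, var = gpr_predict(x_train, y_train, candidates, length_scale)
    return int(np.argmax(density * var))

# Toy data: a harmonic potential sampled at three points.
x_train = np.array([-1.0, 0.0, 1.0])
y_train = 0.5 * x_train ** 2
candidates = np.array([-2.0, -0.5, 0.5, 2.5])   # stand-in for ADGA-proposed points
density = np.exp(-0.5 * candidates ** 2)        # stand-in for the vibrational density
best = select_next_point(x_train, y_train, candidates, density)
print("next single point at x =", candidates[best])
```

In an iterative scheme of this kind, the selected point would be evaluated with the electronic structure method, appended to the training set, and the loop repeated until the density-weighted variance of all remaining candidates falls below a chosen threshold.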