Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany.
J Chem Theory Comput. 2022 Aug 9;18(8):4846-4855. doi: 10.1021/acs.jctc.2c00501. Epub 2022 Jul 11.
Accurate thermochemistry is essential in many chemical disciplines, such as astro-, atmospheric, or combustion chemistry. These areas often involve fleetingly existent intermediates whose thermochemistry is difficult to assess. Whenever direct calorimetric experiments are infeasible, accurate computational estimates of relative molecular energies are required. However, high-level computations, often using coupled cluster theory, are generally resource-intensive. To expedite the process using machine learning techniques, we generated a database of energies for small organic molecules at the CCSD(T)/cc-pVDZ, CCSD(T)/aug-cc-pVDZ, and CCSD(T)/cc-pVTZ levels of theory. Leveraging the power of deep learning by employing graph neural networks, we are able to predict the effect of perturbatively included triples (T), that is, the difference between CCSD and CCSD(T) energies, with a mean absolute error of 0.25, 0.25, and 0.28 kcal mol⁻¹ (R² of 0.998, 0.997, and 0.998) with the cc-pVDZ, aug-cc-pVDZ, and cc-pVTZ basis sets, respectively. Our models were further validated by application to three validation sets taken from the S22 Database as well as to a selection of known theoretically challenging cases.
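The quantities named in the abstract can be made concrete with a minimal sketch: the (T) correction as the difference between CCSD(T) and CCSD energies, and the two error metrics quoted for the predictions. This is not the authors' code. The function names and the sample numbers are hypothetical, and R² is assumed here to mean the coefficient of determination, which the abstract does not specify.

```python
# Illustrative sketch only: names and sample values are hypothetical.
HARTREE_TO_KCAL_PER_MOL = 627.5094740631  # standard conversion factor


def triples_correction(e_ccsd, e_ccsd_t):
    """Return the (T) correction in kcal/mol from energies given in hartree."""
    return (e_ccsd_t - e_ccsd) * HARTREE_TO_KCAL_PER_MOL


def mean_absolute_error(reference, predicted):
    """Average absolute deviation between reference and predicted values."""
    return sum(abs(r - p) for r, p in zip(reference, predicted)) / len(reference)


def r_squared(reference, predicted):
    """Coefficient of determination (assumed meaning of R² in the abstract)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Made-up (T) corrections in kcal/mol, NOT data from the paper.
reference = [-2.0, -4.0, -6.0, -8.0]
predicted = [-2.5, -3.5, -6.5, -7.5]

if __name__ == "__main__":
    print(f"MAE = {mean_absolute_error(reference, predicted):.3f} kcal/mol")
    print(f"R^2 = {r_squared(reference, predicted):.3f}")
```

In a delta-learning workflow of this kind, the predicted correction would be added to a computed CCSD energy to approximate the CCSD(T) energy without the cost of the triples step.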