Dos Santos Luan G F, Nebgen Benjamin T, Allen Alice E A, Hamilton Brenden W, Matin Sakib, Smith Justin S, Messerly Richard A
Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, Texas 79409, United States.
Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States.
J Chem Inf Model. 2025 Feb 10;65(3):1198-1210. doi: 10.1021/acs.jcim.4c01847. Epub 2025 Jan 28.
In computational chemistry, predicting bond dissociation energies (BDEs) presents well-known challenges, particularly due to the multireference character of reactive systems. Many chemical reactions involve configurations where single-reference methods fall short, because the electronic structure can change significantly during bond breaking. Because generating training data for partially broken bonds is challenging, even state-of-the-art reactive machine learning interatomic potentials (MLIPs) often fail to predict reliable BDEs and smooth dissociation curves. By contrast, simple and inexpensive physics-based models, such as the well-established Morse potential, do not suffer from any such limitations. This work leverages the Morse potential to improve reactive MLIPs by augmenting the training data set with inexpensive Morse data along the dissociation pathways. This physics-constrained data augmentation (PCDA) approach results in MLIPs with smooth bond dissociation curves as well as near coupled-cluster level BDEs, all without requiring any expensive multireference quantum mechanical calculations. A case study for methane combustion demonstrates how the PCDA approach can improve an existing reactive MLIP, namely, ANI-1xnr. Not only are the BDEs and bond dissociation curves for all radicals and molecules significantly improved compared to ANI-1xnr, but the PCDA-trained MLIP also retains the reliability of ANI-1xnr when performing reactive molecular dynamics simulations.
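As a rough illustration of the idea of generating inexpensive Morse data along a dissociation pathway, the sketch below evaluates the standard Morse form V(r) = D_e (1 - exp(-a (r - r_e)))^2 on a grid of bond lengths. It is not the authors' procedure: the abstract does not say how the Morse parameters are chosen or how the Morse points are combined with quantum mechanical reference data. The function names `morse_energy` and `morse_scan` are hypothetical, and the parameter values are placeholders loosely suggestive of a C-H bond, not values taken from the paper.

```python
import numpy as np


def morse_energy(r, d_e, a, r_e):
    """Morse potential: zero at r = r_e, approaching d_e as r grows large."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2


def morse_scan(d_e, a, r_e, r_max=6.0, n_points=25):
    """Bond lengths from equilibrium out to r_max and their Morse energies.

    Hypothetical helper: pairs like these could label stretched-bond
    geometries that are added to an MLIP training set.
    """
    r = np.linspace(r_e, r_max, n_points)
    return r, morse_energy(r, d_e, a, r_e)


# Placeholder parameters (assumptions, not from the paper):
# well depth in eV, width parameter in 1/Angstrom, equilibrium length in Angstrom.
D_E, A, R_E = 4.5, 1.9, 1.09

distances, energies = morse_scan(D_E, A, R_E)
for r, e in zip(distances[::6], energies[::6]):
    print(f"r = {r:5.2f} A   E = {e:6.3f} eV")
```

The curve rises monotonically from zero at the equilibrium bond length and levels off at the well depth, which is the smooth dissociation behavior the abstract attributes to the Morse potential.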