Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia.
Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 0200, Australia.
J Chem Theory Comput. 2022 Mar 8;18(3):1607-1618. doi: 10.1021/acs.jctc.1c01264. Epub 2022 Feb 17.
Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work we show that QM descriptors, by incorporating electronic information into the descriptor, improve ML predictions of dimer interaction energy in terms of both accuracy and data efficiency. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of the Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods in predicting interaction energies for dimers of varying charge type, size, and intermolecular separation. In a low-data regime, EDDIE-ML outperforms the direct ML schemes and is the only model readily transferable to larger, more complex systems, including base pair trimers and porous cages. The underlying physical connection between the density and the interaction energy enables EDDIE-ML to reach an accuracy comparable to that of modern DFT functionals with fewer training data points than other ML methods.
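For orientation, the two quantities the model connects can be written in their conventional supermolecular form. This is a sketch under assumptions: the abstract does not give these definitions, and the symbols below (monomers A and B, dimer AB, and the learned map f_ML) are introduced here for illustration only. Whether the reference energies are counterpoise-corrected, and how the density is turned into a descriptor, is not stated in the abstract.

```latex
% Assumed conventional definitions; not quoted from the paper.
\begin{align}
  E_{\mathrm{int}} &= E_{AB} - E_{A} - E_{B}, \\
  \Delta\rho(\mathbf{r}) &= \rho_{AB}(\mathbf{r}) - \rho_{A}(\mathbf{r}) - \rho_{B}(\mathbf{r}), \\
  E_{\mathrm{int}} &\approx f_{\mathrm{ML}}\!\left[\Delta\rho^{\mathrm{HF}}\right].
\end{align}
```

Read this way, the deformation density vanishes wherever the monomer densities are unperturbed by the partner, so the descriptor carries only the electronic response to the interaction, consistent with the physical connection the abstract credits for the model's data efficiency.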