Ng Wai-Pan, Liang Qiujiang, Yang Jun
Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China.
Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China.
J Chem Theory Comput. 2023 Aug 22;19(16):5439-5449. doi: 10.1021/acs.jctc.3c00518. Epub 2023 Jul 28.
Accurate ab initio prediction of electronic energies by explicitly solving post-Hartree-Fock equations is very expensive for macromolecules. Here we exploit the physically justified local correlation feature in a compact basis of small molecules and construct an expressive low-data deep neural network (dNN) model that yields machine-learned electron correlation energies on par with the MP2 and CCSD levels of theory for more complex molecules and for datasets that are not represented in the training set. We show that the dNN-powered model is data efficient and makes highly transferable predictions across alkanes of various lengths, organic molecules with non-covalent and biomolecular interactions, and water clusters of different sizes and morphologies. In particular, by training on 800 (H₂O)ₙ water clusters with the local correlation descriptors, accurate MP2/cc-pVTZ correlation energies of larger water clusters can be predicted with a small random error, within chemical accuracy of the exact values, while the majority of the prediction deviation is attributed to an intrinsic systematic error. Our results reveal that an extremely compact local correlation feature set, although poor for any direct post-Hartree-Fock calculation, has a prominent advantage in preserving the important electron correlation patterns needed for accurate, transferable predictions across distinct molecular compositions, bond types, and geometries.
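The split between a random and a systematic error mentioned in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that the systematic part is measured as the mean signed error and the random part as the sample standard deviation about that mean; the abstract does not state which definitions the authors use. The function name `decompose_errors` and all energies below are made up for illustration and are not data from the paper. The 1 kcal/mol threshold is the conventional meaning of "chemical accuracy".

```python
from statistics import mean, stdev

HARTREE_TO_KCAL_PER_MOL = 627.509      # standard unit conversion
CHEMICAL_ACCURACY_KCAL_PER_MOL = 1.0   # conventional threshold (assumed here)


def decompose_errors(predicted, reference):
    """Split prediction errors into a systematic and a random part.

    Inputs are correlation energies in hartree; outputs are in kcal/mol.
    Assumed definitions (not taken from the paper):
      systematic = mean signed error
      random     = sample standard deviation of the errors
    """
    errors = [
        (p - r) * HARTREE_TO_KCAL_PER_MOL
        for p, r in zip(predicted, reference)
    ]
    systematic = mean(errors)
    random_part = stdev(errors)
    return systematic, random_part


# Made-up illustrative energies in hartree, NOT results from the paper.
reference = [-2.1000, -2.1010, -2.0995, -2.1005]
predicted = [-2.0950, -2.0961, -2.0944, -2.0956]

systematic, random_part = decompose_errors(predicted, reference)
within = random_part < CHEMICAL_ACCURACY_KCAL_PER_MOL

print(f"systematic error: {systematic:.2f} kcal/mol")
print(f"random error:     {random_part:.2f} kcal/mol")
print(f"random error within chemical accuracy: {within}")
```

With these toy numbers the offset dominates the total deviation while the scatter around it stays well below 1 kcal/mol, which is the qualitative pattern the abstract describes.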