Xu Lifeng, Jiang Jian
Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Polymer Physics and Chemistry, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, P. R. China.
University of Chinese Academy of Sciences, Beijing 100049, P. R. China.
J Chem Theory Comput. 2024 Sep 12. doi: 10.1021/acs.jctc.4c00618.
Machine-learning force fields have made significant strides in reproducing the potential energy surface with quantum chemical accuracy. However, this approach still faces several challenges, e.g., extrapolating to uncharted chemical spaces, interpreting long-range electrostatics, and mapping complex macroscopic properties. To address these issues, we advocate for a synergistic integration of physical principles and machine learning techniques within the framework of a physically informed neural network (PINN). This approach involves incorporating physical knowledge into the parameters of the neural network, coupled with an efficient global optimizer, the Tabu-Adam algorithm, proposed in this work to augment optimization under strict physical constraints. We choose the AMOEBA+ force field as the physics-based model for embedding and then train and test it using the diethylene glycol dimethyl ether (DEGDME) data set as a case study. The results reveal a breakthrough in constructing a precise and noise-robust machine learning force field. Utilizing two training sets with hundreds of samples, our model exhibits remarkable generalization and density functional theory (DFT) accuracy in describing molecular interactions and enables precise prediction of macroscopic properties such as the diffusion coefficient with minimal cost. This work provides valuable insight into establishing a fundamental framework of the PINN force field.
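The abstract names the Tabu-Adam optimizer but does not describe its internals. The following is only a generic sketch of how a tabu list might be combined with Adam under box constraints, and should not be read as the authors' algorithm. It assumes that "strict physical constraint" can be approximated by clipping each parameter to a bounded interval, and that the tabu list stores previously converged solutions so that new starting points are drawn away from them. The names `adam_minimize`, `tabu_adam`, `tabu_radius`, and the toy objective are all hypothetical.

```python
import math
import random


def adam_minimize(grad, x0, bounds, lr=0.05, steps=500,
                  b1=0.9, b2=0.999, eps=1e-8):
    """Projected Adam: a standard Adam update, then clipping to the box."""
    x = list(x0)
    m = [0.0] * len(x)  # first-moment (mean) estimate
    v = [0.0] * len(x)  # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(x)
        for i in range(len(x)):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            m_hat = m[i] / (1 - b1 ** t)  # bias correction
            v_hat = v[i] / (1 - b2 ** t)
            x[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
            lo, hi = bounds[i]
            x[i] = min(max(x[i], lo), hi)  # assumed form of the constraint
    return x


def tabu_adam(f, grad, bounds, restarts=20, tabu_radius=0.5, seed=0):
    """Restarted Adam that avoids starting near already-visited solutions."""
    rng = random.Random(seed)
    tabu = []  # converged points from earlier restarts
    best_x, best_f = None, math.inf
    for _ in range(restarts):
        # Draw a start point outside every tabu ball (bounded retries).
        for _ in range(100):
            x0 = [rng.uniform(lo, hi) for lo, hi in bounds]
            if all(math.dist(x0, t) > tabu_radius for t in tabu):
                break
        x = adam_minimize(grad, x0, bounds)
        tabu.append(x)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f


# Toy double-well objective (hypothetical, unrelated to AMOEBA+):
# a global minimum near x = -1.30 and a local minimum near x = 1.13.
def toy_f(x):
    return x[0] ** 4 - 3 * x[0] ** 2 + x[0]


def toy_grad(x):
    return [4 * x[0] ** 3 - 6 * x[0] + 1]


best_x, best_f = tabu_adam(toy_f, toy_grad, bounds=[(-2.0, 2.0)])
print(best_x, best_f)
```

In this sketch the tabu list only steers where each restart begins; a single Adam run that starts in the right-hand well stays there, which is the kind of local trapping a global search layer is meant to counter.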