Perez-Lemus Gustavo, Xu Yinan, Jin Yezhi, Zubieta Rico Pablo, de Pablo Juan
Pritzker School of Molecular Engineering, The University of Chicago, Chicago, Illinois 60637, USA.
Department of Chemical and Biological Engineering, Tandon School of Engineering, Courant Department of Computer Science, and Department of Physics, New York University, New York, New York 10012, USA.
J Chem Phys. 2024 Dec 28;161(24). doi: 10.1063/5.0237399.
Machine learning interatomic potentials (MLIPs) are rapidly gaining interest for molecular modeling because they balance a quantum-mechanical-level description of atomic interactions against reasonable computational cost. However, questions remain regarding the stability of simulations that use these potentials and the extent to which the learned potential energy function can be safely extrapolated. Past studies have encountered challenges when applying MLIPs to classical benchmark systems. In this work, we show that some of these challenges are related to characteristics of the training datasets, particularly inefficient exploration of the dynamical modes and the inclusion of rigid constraints. We demonstrate that long-time stability in simulations with MLIPs can be achieved by generating unconstrained datasets from unbiased classical simulations, provided that the important dynamical modes are correctly sampled. In addition, we emphasize that achieving precise energy predictions calls for enhanced sampling techniques in dataset generation, and we demonstrate that safe extrapolation of MLIPs depends on judicious choices related to the system's underlying free energy landscape and the symmetry features embedded within the machine learning models.