Hou Yi-Fan, Zhang Lina, Zhang Quanhao, Ge Fuchun, Dral Pavlo O
State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China.
Institute of Physics, Faculty of Physics, Astronomy, and Informatics, Nicolaus Copernicus University in Toruń, ul. Grudziądzka 5, Toruń 87-100, Poland.
J Chem Theory Comput. 2024 Sep 12. doi: 10.1021/acs.jctc.4c00821.
Quantum chemical simulations can be greatly accelerated by constructing machine learning potentials, which is often done using active learning (AL). The usefulness of the constructed potentials is often limited by the high effort required to build them and by their insufficient robustness in simulations. Here, we introduce an end-to-end AL protocol for constructing robust, data-efficient potentials with an affordable investment of time and resources and minimal human intervention. Our AL protocol is based on physics-informed sampling of training points, automatic selection of initial data, uncertainty quantification, and convergence monitoring. The versatility of this protocol is shown in our implementation of quasi-classical molecular dynamics for simulating vibrational spectra, a conformer search of a key biochemical molecule, and a time-resolved study of the mechanism of the Diels-Alder reaction. These investigations took us days, instead of the weeks that pure quantum chemical calculations would take on a high-performance computing cluster.
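The four ingredients named in the abstract (sampling, initial data selection, uncertainty quantification, convergence monitoring) can be illustrated with a minimal toy loop. This is a sketch under stated assumptions, not the authors' implementation: the one-dimensional Morse-like `reference_energy` stands in for a quantum chemical calculation, the Gaussian candidate pool stands in for physics-informed sampling, and a Gaussian process predictive standard deviation is swapped in as the uncertainty measure. All function names, parameters, and thresholds are hypothetical.

```python
import numpy as np


def reference_energy(x):
    """Toy stand-in for an expensive quantum chemical calculation (Morse-like curve)."""
    return (1.0 - np.exp(-(x - 1.0))) ** 2


def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1D arrays of geometries."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def predict(x_train, y_train, x_query, length_scale=0.3, noise=1e-6):
    """Return the Gaussian process mean and standard deviation at x_query."""
    k_train = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_cross = rbf_kernel(x_train, x_query, length_scale)
    mean = k_cross.T @ np.linalg.solve(k_train, y_train)
    # Prior variance is 1 for this kernel, so the uncertainty is dimensionless.
    variance = 1.0 - np.sum(k_cross * np.linalg.solve(k_train, k_cross), axis=0)
    return mean, np.sqrt(np.clip(variance, 0.0, None))


def active_learning(n_candidates=200, n_initial=3, threshold=0.05, seed=0):
    """Label the most uncertain candidate until every candidate is below threshold."""
    rng = np.random.default_rng(seed)
    # Assumed sampling scheme: Gaussian spread around the minimum at x = 1,
    # loosely mimicking a thermal distribution in a harmonic well.
    candidates = rng.normal(loc=1.0, scale=0.4, size=n_candidates)
    # Assumed initial data: the first few sampled geometries.
    x_train = candidates[:n_initial].copy()
    y_train = reference_energy(x_train)
    for _ in range(n_candidates):
        _, std = predict(x_train, y_train, candidates)
        worst = int(np.argmax(std))
        if std[worst] < threshold:
            # Convergence monitoring: no candidate is uncertain any more.
            return x_train, y_train, candidates, True
        x_train = np.append(x_train, candidates[worst])
        y_train = np.append(y_train, reference_energy(candidates[worst]))
    return x_train, y_train, candidates, False


if __name__ == "__main__":
    x_train, y_train, candidates, converged = active_learning()
    print(f"converged={converged}, labeled points={len(x_train)}")
```

Labeling only the single most uncertain candidate per iteration keeps the number of reference calculations small, which is the data-efficiency idea the abstract describes; a production protocol would sample new geometries from simulations with the current potential instead of using a fixed pool.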