Christensen Anders S, Bratholm Lars A, Faber Felix A, von Lilienfeld O Anatole
Department of Chemistry, National Center for Computational Design and Discovery of Novel Materials (MARVEL), Institute of Physical Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland.
School of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom.
J Chem Phys. 2020 Jan 31;152(4):044107. doi: 10.1063/1.5126701.
We introduce the FCHL19 representation for atomic environments in molecules or condensed-phase systems. Machine learning models based on FCHL19 are able to yield predictions of atomic forces and energies of query compounds with chemical accuracy on the scale of milliseconds. FCHL19 revises our previous work [F. A. Faber et al., J. Chem. Phys. 148, 241717 (2018)]: the representation is now discretized, and the individual features are rigorously optimized using Monte Carlo optimization. Combined with a Gaussian kernel function that incorporates elemental screening, chemical accuracy is reached for energy learning on the QM7b and QM9 datasets after training for minutes and hours, respectively. The model also performs well for non-bonded interactions in the condensed phase: for a set of water clusters, the mean absolute error (MAE) in the binding energy is less than 0.1 kcal/mol/molecule after training on 3200 samples. For force learning on the MD17 dataset, our optimized model similarly displays state-of-the-art accuracy with a regressor based on Gaussian process regression. When the revised FCHL19 representation is combined with the operator quantum machine learning regressor, forces and energies can be predicted in only a few milliseconds per atom. The model presented herein is fast and lightweight enough for use in general chemistry problems as well as molecular dynamics simulations.
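The abstract does not spell out the kernel, so the following is only a minimal sketch of what a Gaussian kernel with elemental screening could look like when used for energy learning with kernel ridge regression. It assumes that each molecule is given as an array of per-atom feature vectors plus an array of nuclear charges, and that "screening" means only atom pairs of the same element contribute. The function names, the kernel width `sigma`, and the regularizer `reg` are hypothetical; the feature vectors here are toy numbers, not actual FCHL19 representations.

```python
import numpy as np

def screened_gaussian_kernel(reps_a, charges_a, reps_b, charges_b, sigma):
    """Sum of Gaussian similarities over atom pairs that share an element."""
    # Pairwise squared Euclidean distances between atomic feature vectors.
    diff = reps_a[:, None, :] - reps_b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Elemental screening (assumed form): only pairs with identical
    # nuclear charge are allowed to contribute.
    same_element = charges_a[:, None] == charges_b[None, :]
    return float(np.sum(np.exp(-sq_dist / (2.0 * sigma ** 2)) * same_element))

def kernel_matrix(mols_a, mols_b, sigma):
    """Kernel matrix between two lists of (features, charges) molecules."""
    return np.array([
        [screened_gaussian_kernel(ra, za, rb, zb, sigma) for rb, zb in mols_b]
        for ra, za in mols_a
    ])

def fit(train_mols, energies, sigma, reg=1e-8):
    """Kernel ridge regression: solve (K + reg * I) alpha = energies."""
    K = kernel_matrix(train_mols, train_mols, sigma)
    return np.linalg.solve(K + reg * np.eye(len(train_mols)), energies)

def predict(train_mols, query_mols, alphas, sigma):
    """Predict energies of query molecules from fitted coefficients."""
    return kernel_matrix(query_mols, train_mols, sigma) @ alphas

# Toy usage: three single-atom "molecules" with 1-D features (hypothetical data).
train = [(np.array([[x]]), np.array([1])) for x in (0.0, 1.0, 2.0)]
energies = np.array([0.0, 1.0, 4.0])
alphas = fit(train, energies, sigma=1.0)
fitted = predict(train, train, alphas, sigma=1.0)
```

The screening mask keeps unlike elements from being compared at all, so a hydrogen environment never contributes similarity to an oxygen environment, however close their feature vectors happen to be.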