Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.
Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44801 Bochum, Germany.
J Chem Phys. 2018 Jun 28;148(24):241730. doi: 10.1063/1.5024611.
Machine learning of atomic-scale properties is revolutionizing molecular modeling, making it possible to evaluate inter-atomic potentials with first-principles accuracy at a fraction of the cost. The accuracy, speed, and reliability of machine learning potentials, however, depend strongly on the way atomic configurations are represented, i.e., on the choice of descriptors used as input for the machine learning method. The raw Cartesian coordinates are typically transformed into "fingerprints," or "symmetry functions," that are designed to encode, in addition to the structure, important properties of the potential energy surface such as its invariance with respect to rotation, translation, and permutation of like atoms. Here we discuss automatic protocols to select a number of fingerprints out of a large pool of candidates, based on the correlations that are intrinsic to the training data. This procedure can greatly simplify the construction of neural network potentials that strike the best balance between accuracy and computational efficiency, and it has the potential to accelerate by orders of magnitude the evaluation of Gaussian approximation potentials based on the smooth overlap of atomic positions kernel. We present applications to the construction of neural network potentials for water and for an Al-Mg-Si alloy and to the prediction of the formation energies of small organic molecules using Gaussian process regression.
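To make the idea of selecting fingerprints from the correlations in the training data more concrete, the following is a minimal sketch of one possible greedy scheme. It is an illustration under stated assumptions, not necessarily the protocol used in the paper: the function name `select_fingerprints`, the distance `1 - |Pearson correlation|`, and the choice of the highest-variance column as the starting point are all hypothetical choices made for this example. The input is assumed to be a matrix whose rows are atomic environments from the training set and whose columns are candidate fingerprints.

```python
import numpy as np


def select_fingerprints(X, n_select):
    """Greedy farthest-point selection of fingerprint columns (illustrative).

    X        : array of shape (n_environments, n_candidates), the values of
               each candidate fingerprint over the training environments.
    n_select : number of fingerprints to keep.

    Returns the indices of the selected columns, in selection order.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)

    # Constant columns carry no information and have no defined correlation.
    active = np.flatnonzero(std > 1e-12)
    if n_select > active.size:
        raise ValueError("n_select exceeds the number of non-constant columns")

    # Standardize, then build the |Pearson correlation| matrix between columns.
    Z = (X[:, active] - X[:, active].mean(axis=0)) / std[active]
    corr = np.abs(Z.T @ Z) / X.shape[0]
    dist = 1.0 - corr  # strongly correlated fingerprints are "close"

    # Assumption: start from the column with the largest variance.
    first = int(np.argmax(std[active]))
    selected = [first]
    min_dist = dist[first].copy()
    min_dist[first] = -np.inf  # never pick the same column twice

    # Repeatedly add the column least correlated with everything chosen so far.
    while len(selected) < n_select:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
        min_dist[nxt] = -np.inf

    return [int(active[i]) for i in selected]
```

In this sketch a fingerprint that is nearly a linear copy of one already chosen sits at a distance close to zero and is therefore picked last, which is the sense in which redundancy in the training data drives the selection. The number of fingerprints to keep is left to the user, who would in practice tune it against validation error and evaluation cost.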