Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria.
J Chem Phys. 2018 Jun 28;148(24):241709. doi: 10.1063/1.5019667.
We introduce weighted atom-centered symmetry functions (wACSFs) as descriptors of a chemical system's geometry for use in the prediction of chemical properties such as enthalpies or potential energies via machine learning. The wACSFs are based on conventional atom-centered symmetry functions (ACSFs) but overcome the unfavorable scaling of the latter with an increasing number of different elements in a chemical system. The performance of the two descriptors is compared by using them as inputs to high-dimensional neural network potentials (HDNNPs), with the molecular structures and associated enthalpies of the 133 855 molecules in the QM9 database, which contain up to five different elements, serving as reference data. Substantially fewer wACSFs than ACSFs are needed to obtain a comparable spatial resolution of the molecular structures. At the same time, this smaller set of wACSFs leads to significantly better generalization performance of the machine learning potential than the large set of conventional ACSFs. Furthermore, we show that the intrinsic parameters of the descriptors can in principle be optimized with a genetic algorithm in a highly automated manner. For the wACSFs employed here, however, we find that a simple empirical parametrization scheme is sufficient to obtain HDNNPs with high accuracy.
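The scaling argument above can be made concrete with a minimal sketch. The abstract does not state the functional form of the descriptors, so the code below assumes a common choice: a Gaussian radial function multiplied by a cosine cutoff, with each neighbor weighted by its atomic number. The function names, the parameters `eta`, `mu`, and `r_c`, and the counts of radial and angular functions are all hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff (assumed form): 1 at r = 0, smoothly 0 at r >= r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def radial_wacsf(center, neighbors, eta, mu, r_c):
    """Element-weighted radial descriptor for one central atom.

    neighbors is a list of (atomic_number, (x, y, z)) tuples.
    Assumption: the element-dependent weight is the atomic number itself.
    """
    total = 0.0
    for z, pos in neighbors:
        r = math.dist(center, pos)
        total += z * math.exp(-eta * (r - mu) ** 2) * cutoff(r, r_c)
    return total


def n_descriptors_acsf(n_elements, n_radial, n_angular):
    """Conventional ACSFs: one radial set per neighbor element and one
    angular set per unordered pair of neighbor elements."""
    n_pairs = n_elements * (n_elements + 1) // 2
    return n_radial * n_elements + n_angular * n_pairs


def n_descriptors_wacsf(n_radial, n_angular):
    """Weighted ACSFs: the element enters as a weight, so the count
    does not depend on the number of elements."""
    return n_radial + n_angular


if __name__ == "__main__":
    # Hypothetical water-like fragment: O at the origin, two H neighbors.
    neighbors = [(1, (0.96, 0.0, 0.0)), (1, (-0.24, 0.93, 0.0))]
    print(radial_wacsf((0.0, 0.0, 0.0), neighbors, eta=1.0, mu=1.0, r_c=4.0))

    # Five elements as in QM9; 22 radial and 10 angular are illustrative.
    print(n_descriptors_acsf(5, 22, 10), n_descriptors_wacsf(22, 10))
```

Under these assumptions the conventional descriptor count grows quadratically with the number of elements through the angular term, while the weighted count stays fixed, which is the scaling behavior the abstract describes.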