Pozdnyakov Sergey N, Willatt Michael J, Bartók Albert P, Ortner Christoph, Csányi Gábor, Ceriotti Michele
Laboratory of Computational Science and Modelling, Institute of Materials, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland.
Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom.
Phys Rev Lett. 2020 Oct 16;125(16):166001. doi: 10.1103/PhysRevLett.125.166001.
Many-body descriptors are widely used to represent atomic environments in the construction of machine-learned interatomic potentials and, more broadly, for fitting, classification, and embedding tasks on atomic structures. There is a widespread belief in the community that three-body correlations are likely to provide an overcomplete description of the environment of an atom. We produce several counterexamples to this belief, with the consequence that any classifier, regression, or embedding model for atom-centered properties that uses three-body (or four-body) features will incorrectly give identical results for different configurations. Writing global properties (such as total energies) as a sum of many atom-centered contributions mitigates the impact of this fundamental deficiency, which explains the success of current "machine-learning" force fields. We anticipate the issues that will arise as the desired accuracy increases, and suggest potential solutions.
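The kind of degeneracy described in the abstract can be sketched with a small planar toy case. The construction below is an illustration chosen for this note and is not claimed to be one of the paper's counterexamples: four neighbors sit on a unit circle around a central atom, at multiples of 30 degrees. The function names (`environment`, `three_body_descriptor`, `angular_gaps`) and the two index sets are assumptions made for the sketch. The three-body descriptor is taken to be the unordered collection of (r_i, r_j, cos θ_ij) over neighbor pairs, and the sorted angular gaps serve only as a simple rotation- and reflection-invariant check that the two environments are not congruent.

```python
import itertools
import math


def environment(steps, n_div=12):
    """Neighbors on the unit circle around a central atom at the origin.

    Each entry k of `steps` places one neighbor at angle 2*pi*k/n_div.
    """
    return [
        (math.cos(2 * math.pi * k / n_div), math.sin(2 * math.pi * k / n_div))
        for k in steps
    ]


def three_body_descriptor(neighbors, digits=9):
    """Sorted list of (r_i, r_j, cos(theta_ij)) over all neighbor pairs.

    Values are rounded so that floating-point noise does not affect equality.
    """
    feats = []
    for a, b in itertools.combinations(neighbors, 2):
        ra, rb = math.hypot(*a), math.hypot(*b)
        cos_t = (a[0] * b[0] + a[1] * b[1]) / (ra * rb)
        r_lo, r_hi = sorted((ra, rb))
        feats.append((round(r_lo, digits), round(r_hi, digits), round(cos_t, digits)))
    return sorted(feats)


def angular_gaps(neighbors, digits=9):
    """Sorted gaps between angularly adjacent neighbors.

    Planar rotations and reflections leave this list unchanged, so two
    environments with different gap lists cannot be congruent.
    """
    angles = sorted(math.atan2(y, x) % (2 * math.pi) for x, y in neighbors)
    n = len(angles)
    gaps = [(angles[(i + 1) % n] - angles[i]) % (2 * math.pi) for i in range(n)]
    return sorted(round(g, digits) for g in gaps)


# Two hypothetical index sets: every pairwise angular separation from
# 30 to 180 degrees occurs exactly once in each of them.
env_a = environment([0, 1, 4, 6])
env_b = environment([0, 1, 3, 7])

same_three_body = three_body_descriptor(env_a) == three_body_descriptor(env_b)
same_geometry = angular_gaps(env_a) == angular_gaps(env_b)
print("identical three-body descriptors:", same_three_body)
print("identical angular gaps:", same_geometry)
```

In this toy case any model that sees only the three-body descriptor cannot tell `env_a` from `env_b`, even though the gap lists show that they are different arrangements. The sketch says nothing about the four-body case or about three-dimensional environments, which the paper addresses.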