Sanggyu Chong, Filippo Bigi, Federico Grasselli, Philip Loche, Matthias Kellner, Michele Ceriotti
Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.
Faraday Discuss. 2025 Jan 14;256(0):322-344. doi: 10.1039/d4fd00101j.
The widespread application of machine learning (ML) in the chemical sciences makes it essential to understand how ML models learn to correlate chemical structures with their properties, and how training efficiency can be improved while guaranteeing interpretability and transferability. In this work, we demonstrate the wide utility of prediction rigidities, a family of metrics derived from the loss function, in understanding the robustness of ML model predictions. We show that prediction rigidities allow the assessment of a model not only at the global level, but also at the local or component-wise level at which intermediate (atomic, body-ordered, or range-separated) predictions are made. We leverage these metrics to understand the learning behavior of different ML models, and to guide efficient dataset construction for model training. Finally, we implement the formalism for an ML model targeting a coarse-grained system, demonstrating the applicability of prediction rigidities to an even broader class of atomistic modeling problems.
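To illustrate what a loss-derived rigidity metric can look like in practice, the sketch below computes a prediction-rigidity-style quantity for a simple ridge-regression model. This is an illustrative assumption, not the authors' implementation: for a linear model with quadratic loss, the rigidity of a prediction at a test point can be written as the inverse of the quadratic form x*ᵀ H⁻¹ x*, where H = XᵀX + λI is the Hessian of the regularized loss. The function name `prediction_rigidity` and the toy data are hypothetical.

```python
import numpy as np

def prediction_rigidity(X, x_star, lam=1e-8):
    """Rigidity of the prediction at x_star for a ridge model fit on X.

    Illustrative sketch: H = X^T X + lam*I is the Hessian of the
    ridge-regularized squared loss; the rigidity is the inverse of the
    variance-like quadratic form x_star^T H^{-1} x_star. Large values
    indicate a prediction that is strongly constrained by the training set.
    """
    H = X.T @ X + lam * np.eye(X.shape[1])
    return 1.0 / (x_star @ np.linalg.solve(H, x_star))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))         # training features (toy data)
x_in = X.mean(axis=0)                 # point well covered by the training set
x_out = 10.0 * rng.normal(size=5)     # point far outside the training distribution

# A well-covered point should be predicted more rigidly than an outlier.
print(prediction_rigidity(X, x_in) > prediction_rigidity(X, x_out))
```

In this toy setting, low rigidity flags test points where the loss landscape only weakly constrains the prediction, which is the kind of signal the abstract describes using to guide dataset construction.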