Fleck Maximilian, Darouich Samir, Spera Marcelle B M, Hansen Niels
Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Pfaffenwaldring 9, 70569, Stuttgart, Germany.
Institute for Artificial Intelligence, University of Stuttgart, Universitätsstraße 32, 70569, Stuttgart, Germany.
J Cheminform. 2025 Aug 28;17(1):131. doi: 10.1186/s13321-025-01070-9.
When data availability is limited, predicting properties through purely data-driven machine learning (ML) is challenging. Integrating physically based modeling techniques into ML methods may lead to better performance. In a recent work by Chew et al. ("Advancing material property prediction: using physics-informed machine learning models for viscosity"), descriptors from classical molecular dynamics (MD) simulations were incorporated into a quantitative structure-property relationship to accurately predict the temperature-dependent viscosity of pure liquids. Through feature importance analysis, the authors found that the heat of vaporization was the most relevant descriptor for predicting viscosity. In this comment, we discuss the physical origin of this finding by referring to Eyring's rate theory, and we develop an alternative modeling approach based on a thermodynamics-based architecture that requires less input data.
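The link between viscosity and heat of vaporization mentioned above can be illustrated with the textbook form of Eyring's viscosity expression, η ≈ (N_A h / V_m) exp(ΔG‡ / (R T)), together with the commonly quoted empirical estimate ΔG‡ ≈ 0.408 ΔU_vap. The Python sketch below is only an illustration under these assumptions and is not the model developed in the comment; the function name `eyring_viscosity`, the parameter `activation_fraction`, and the input values are hypothetical and are not taken from the source.

```python
import math

N_A = 6.02214076e23   # Avogadro constant, 1/mol
H = 6.62607015e-34    # Planck constant, J s
R = 8.314462618       # molar gas constant, J/(mol K)


def eyring_viscosity(temp_k, molar_volume_m3, delta_u_vap_j,
                     activation_fraction=0.408):
    """Rough Eyring-type viscosity estimate in Pa s.

    Assumption: the activation free energy of flow is a fixed fraction
    (default 0.408, an empirical textbook value) of the molar internal
    energy of vaporization. This is a sketch, not the authors' model.
    """
    prefactor = N_A * H / molar_volume_m3            # Pa s
    activation_energy = activation_fraction * delta_u_vap_j  # J/mol
    return prefactor * math.exp(activation_energy / (R * temp_k))


if __name__ == "__main__":
    # Illustrative inputs of the order of magnitude of a small organic
    # liquid; these numbers are assumptions, not data from the paper.
    v_m = 9.0e-5      # molar volume, m^3/mol
    du_vap = 3.0e4    # internal energy of vaporization, J/mol
    for temp in (280.0, 300.0, 320.0):
        print(temp, eyring_viscosity(temp, v_m, du_vap))
```

In this form, a larger energy of vaporization raises the exponent and hence the estimated viscosity, while a higher temperature lowers it, which is consistent with the heat of vaporization emerging as a dominant descriptor.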