Bounaceur Roda, Paes Francisco, Privat Romain, Jaubert Jean-Noël
Université de Lorraine, CNRS, LRGP, F-54000, Nancy, France.
J Cheminform. 2025 Aug 29;17(1):132. doi: 10.1186/s13321-025-01062-9.
In this paper, we propose a robust deep-learning model based on a Quantitative Structure-Property Relationship (QSPR) approach for estimating the critical temperature (TC), critical pressure (PC), acentric factor (ACEN), and normal boiling point (NBP) of any molecule composed of C, H, O, N, S, P, F, Cl, Br, and I atoms. The Mordred calculator was used to compute 247 descriptors characterizing the molecules considered in this work. For each property, multiple neural networks were trained within a bagging framework. The predictions of the final ensemble were successfully tested against a large set of experimental data comprising more than 1700 molecules and compared with those of recent learning models from the literature. Comprehensive comparisons and extensive testing highlight the robustness and predictive power of the newly proposed multimodal learning model. The prediction tool is available online (https://lrgp-thermoppt.streamlit.app/), and source code for running the trained models in Python is available on GitHub (https://github.com/bounac80/AI-ThermPpt).
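The bagging step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a one-descriptor least-squares line stands in for each neural network so that the example runs with the Python standard library alone, the data are synthetic, and the names fit_line, train_bagged_ensemble, and predict are hypothetical. The sketch only shows the general idea of training each ensemble member on a bootstrap resample and averaging the members' predictions.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for one neural network)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to a constant model.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def train_bagged_ensemble(xs, ys, n_models=25, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict(models, x):
    """Return the ensemble mean and the spread of the member predictions."""
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.fmean(preds), statistics.pstdev(preds)


# Synthetic data (assumption): one descriptor x, property y = 2x + 1 plus noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

models = train_bagged_ensemble(xs, ys)
mean_pred, spread = predict(models, 2.5)
print(f"ensemble prediction at x=2.5: {mean_pred:.3f} (spread {spread:.3f})")
```

The spread of the member predictions gives a rough indication of how sensitive the estimate is to the training sample, which is one common reason for preferring an ensemble over a single model.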