AI-powered prediction of critical properties and boiling points: a hybrid ensemble learning and QSPR approach.

Authors

Bounaceur Roda, Paes Francisco, Privat Romain, Jaubert Jean-Noël

Affiliation

Université de Lorraine, CNRS, LRGP, F-54000, Nancy, France.

Publication information

J Cheminform. 2025 Aug 29;17(1):132. doi: 10.1186/s13321-025-01062-9.

Abstract

In this paper, we propose a robust deep-learning model based on a Quantitative Structure-Property Relationship (QSPR) approach for estimating the critical temperature (TC), critical pressure (PC), acentric factor (ACEN) and normal boiling point (NBP) of any molecule composed of C, H, O, N, S, P, F, Cl, Br and I atoms. The Mordred calculator was used to compute 247 descriptors that characterize the molecules considered in this work. For each property, multiple neural networks were trained within a bagging framework. The predictions of the final ensemble were successfully tested against a large set of experimental data comprising more than 1700 molecules and compared with those of several recent learning models from the literature. Comprehensive comparisons and extensive testing highlight the robustness and predictive power of the newly proposed multimodal learning model. The prediction tool is available online at https://lrgp-thermoppt.streamlit.app/. Source code for running the trained models in Python is available on GitHub at https://github.com/bounac80/AI-ThermPpt.
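The bagging idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a one-variable least-squares line stands in for each neural network, the single input stands in for the 247 Mordred descriptors, and the data are synthetic. The function names (`fit_line`, `train_bagged`, `predict_bagged`) and all parameter values are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b (assumed stand-in for one neural network)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate bootstrap sample (all x equal): predict the mean.
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def train_bagged(xs, ys, n_models=25, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Return the ensemble mean and the spread of the member predictions."""
    preds = [a * x + b for a, b in models]
    return statistics.fmean(preds), statistics.pstdev(preds)


# Synthetic data (not from the paper): y = 2x + 1 plus Gaussian noise.
noise = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 0.5) for x in xs]

models = train_bagged(xs, ys)
mean, spread = predict_bagged(models, 5.0)
print(f"ensemble prediction at x=5: {mean:.2f} (member spread {spread:.2f})")
```

Averaging the members reduces the variance of any single model, and the spread across members gives a rough indication of how much the prediction depends on the training sample.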


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2c9/12395778/4d975c86aa20/13321_2025_1062_Fig1_HTML.jpg

References cited in this article

1. Predicting Critical Properties and Acentric Factors of Fluids Using Multitask Machine Learning. J Chem Inf Model. 2023 Aug 14;63(15):4574-4588. doi: 10.1021/acs.jcim.3c00546. Epub 2023 Jul 24.
2. Development of QSPR-ANN models for the estimation of critical properties of pure hydrocarbons. J Mol Graph Model. 2023 Jun;121:108450. doi: 10.1016/j.jmgm.2023.108450. Epub 2023 Mar 7.
4. Mordred: a molecular descriptor calculator. J Cheminform. 2018 Feb 6;10(1):4. doi: 10.1186/s13321-018-0258-y.
5. PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J Comput Chem. 2011 May;32(7):1466-74. doi: 10.1002/jcc.21707. Epub 2010 Dec 17.
6. Mold(2), molecular descriptors from 2D structures for chemoinformatics and toxicoinformatics. J Chem Inf Model. 2008 Jul;48(7):1337-44. doi: 10.1021/ci800038f. Epub 2008 Jun 20.
