Prediction of Standard Combustion Enthalpy of Organic Compounds Combining Machine Learning and Chemical Graph Theory: A Strategy.

Authors

Saviñon-Flores Fernanda, Arzola-Flores Jesús A, García-Castro Miguel A, Díaz-Sánchez Fausto, Vidal Robles Esmeralda, Maruri Valderrabano Fidel Aaron

Affiliation

Facultad de Ingeniería Química de la Benemérita Universidad Autónoma de Puebla, 18 Sur y Avenue San Claudio, C.P., Puebla, Pue 72570, México.

Publication information

ACS Omega. 2025 Sep 8;10(36):41828-41848. doi: 10.1021/acsomega.5c05927. eCollection 2025 Sep 16.

DOI: 10.1021/acsomega.5c05927

PMID: 40978437

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12444530/

Abstract

The prediction of thermochemical properties such as the standard enthalpy of combustion is essential for the design and evaluation of energetic materials. This study proposes predicting this property through a quantitative structure-property relationship (QSPR) strategy that combines machine learning and chemical graph theory. The data set consisted of 3477 organic compounds. The SMILES code of each molecule was used to construct its molecular graph, from which topological indices such as Estrada, Wiener, and Gutman, as well as centrality measures, were calculated. These descriptors served as predictors in supervised learning models, with tree-based ensemble models showing the best performance. The best-performing model, random forest, achieved the following metrics on the test set: R² = 0.9810, MAE = 287.5988 kJ·mol⁻¹, MAPE = 0.1048, RMSE = 551.9050 kJ·mol⁻¹, and RMSLE = 0.1933. Interpretability analysis using SHAP confirmed that the Estrada and Gutman indices were the most influential variables in the predictions. In addition, the same random forest model was trained using 210 molecular descriptors obtained from RDKit, yielding slightly better metrics: R² = 0.9927, MAE = 142.2272 kJ·mol⁻¹, MAPE = 0.0484, RMSE = 342.0464 kJ·mol⁻¹, and RMSLE = 0.1172. Moreover, specific models were developed for different families of compounds, achieving R² ≈ 0.99 in all cases. Finally, a clustering analysis using the K-Means algorithm in the space defined by the topological indices enabled the identification of latent molecular patterns, providing a novel framework for organizing and analyzing chemical space. This work demonstrates the potential of combining supervised and unsupervised learning methods with chemical graph theory to enable accurate, robust, and scalable prediction of thermochemical properties such as combustion enthalpy.
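To make the descriptor step concrete, the sketch below computes the three topological indices named in the abstract (Wiener, Gutman, and Estrada) for two small hydrogen-suppressed molecular graphs. It is an illustration under stated assumptions, not the authors' code: the abstract does not say which software built the graphs, so the edge lists here are written by hand instead of being parsed from SMILES, and the helper names (`adjacency`, `wiener_index`, `gutman_index`, `estrada_index`) are hypothetical. The indices follow their standard textbook definitions: Wiener is the sum of shortest-path distances over all atom pairs, Gutman weights each distance by the product of the two vertex degrees, and Estrada is the trace of the matrix exponential of the adjacency matrix.

```python
from collections import deque


def adjacency(n, edges):
    """Build an adjacency list for an undirected graph with n vertices."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def distances_from(adj, source):
    """Breadth-first search distances (in bonds) from one vertex.

    Assumes the graph is connected, as a single molecule is.
    """
    dist = [-1] * len(adj)
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    n = len(adj)
    total = 0
    for u in range(n):
        d = distances_from(adj, u)
        total += sum(d[v] for v in range(u + 1, n))
    return total


def gutman_index(adj):
    """Sum of deg(u) * deg(v) * d(u, v) over all unordered vertex pairs."""
    n = len(adj)
    deg = [len(neighbors) for neighbors in adj]
    total = 0
    for u in range(n):
        d = distances_from(adj, u)
        total += sum(deg[u] * deg[v] * d[v] for v in range(u + 1, n))
    return total


def estrada_index(adj, terms=40):
    """Trace of exp(A), evaluated as the series sum_k trace(A^k) / k!.

    Forty terms are far more than needed for small molecular graphs,
    whose adjacency eigenvalues are bounded by the maximum degree.
    """
    n = len(adj)
    a = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            a[u][v] = 1.0
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = float(n)  # k = 0 term: trace of the identity
    for k in range(1, terms + 1):
        # power becomes A^k / k! by multiplying A^(k-1) / (k-1)! by A / k
        power = [
            [sum(power[i][m] * a[m][j] for m in range(n)) / k for j in range(n)]
            for i in range(n)
        ]
        total += sum(power[i][i] for i in range(n))
    return total


# Carbon skeletons only (hydrogens suppressed); edge lists are hand-written.
n_butane = adjacency(4, [(0, 1), (1, 2), (2, 3)])   # SMILES: CCCC
isobutane = adjacency(4, [(0, 1), (0, 2), (0, 3)])  # SMILES: CC(C)C

print(wiener_index(n_butane), gutman_index(n_butane))    # 10 19
print(wiener_index(isobutane), gutman_index(isobutane))  # 9 15
print(estrada_index(n_butane), estrada_index(isobutane))
```

The two isomers have the same formula but different index values, which is the property that lets such descriptors act as structure-sensitive predictors in a regression model. In a full pipeline, one row of indices per compound would form the feature matrix passed to the random forest.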

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bfe/12444530/03aca470ab8e/ao5c05927_0001.jpg
