Data Science Approach to Estimate Enthalpy of Formation of Cyclic Hydrocarbons.

Authors

Yalamanchi Kiran K, Monge-Palacios M, van Oudenhoven Vincent C O, Gao Xin, Sarathy S Mani

Affiliations

Physical Sciences and Engineering Division, Clean Combustion Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia.

Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia.

Publication information

J Phys Chem A. 2020 Aug 6;124(31):6270-6276. doi: 10.1021/acs.jpca.0c02785. Epub 2020 Jul 23.

Abstract

Despite the increasing importance of cyclic hydrocarbons in various chemical systems, studies on the fundamental properties of these compounds, such as enthalpy of formation, are still scarce. One reason is that estimating the thermodynamic properties of cyclic hydrocarbon species via cost-effective computational approaches, such as group additivity (GA), has several limitations and challenges. In this study, a machine learning (ML) approach using a support vector regression (SVR) algorithm is proposed to predict the standard enthalpy of formation of cyclic hydrocarbon species. The model is developed from a carefully selected dataset of accurate experimental values for 192 species collected from the literature. The molecular descriptors used as input to the SVR are calculated with the alvaDesc software, which computes a total of 5255 features classified into 30 categories. The developed SVR model has an average error of approximately 10 kJ/mol. The SVR model outperforms the GA approach for complex molecules and can therefore be proposed as a novel data-driven approach to estimate enthalpy values for complex cyclic species. A sensitivity analysis is also conducted to identify the features most relevant to the standard enthalpy of formation of cyclic species. Our species dataset is expected to be updated and expanded as new data become available, in order to develop a more accurate SVR model with broader applicability.
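The workflow described in the abstract (descriptor matrix in, SVR out, error reported in kJ/mol) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn and NumPy, it uses synthetic random descriptors in place of the alvaDesc features and the experimental enthalpies, and the kernel, `C`, `epsilon`, the descriptor count of 20, and the 80/20 split are all arbitrary assumptions. Only the dataset size of 192 species is taken from the abstract.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the paper's dataset): 192 "species",
# each described by 20 made-up descriptors instead of alvaDesc features.
rng = np.random.default_rng(0)
n_species, n_descriptors = 192, 20
X = rng.normal(size=(n_species, n_descriptors))

# Hypothetical target: a noisy linear function of the descriptors,
# standing in for the standard enthalpy of formation in kJ/mol.
true_weights = rng.normal(scale=30.0, size=n_descriptors)
y = X @ true_weights + rng.normal(scale=5.0, size=n_species)

# Hold out 20% of the species to estimate the prediction error.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale descriptors, then fit an RBF-kernel SVR.
# Hyperparameters are assumed values, not those reported in the paper.
model = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=1000.0, epsilon=1.0),
)
model.fit(X_train, y_train)

# Mean absolute error on the held-out species.
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"MAE on held-out synthetic species: {mae:.1f} (pseudo kJ/mol)")
```

With real data, `X` would hold the computed molecular descriptors and `y` the experimental enthalpies of formation. Scaling the descriptors first matters because a kernel-based SVR is sensitive to feature magnitudes.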


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1203/7458419/9fd53b6640bf/jp0c02785_0001.jpg
