Development of an innovative data-driven system to generate descriptive prediction equation of dielectric constant on small sample sets.

Authors

Mao Jiashun, Zeb Amir, Kim Min Sung, Jeon Hyeon-Nae, Wang Jianmin, Guan Shenghui, No Kyoung Tai

Affiliations

College of Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon (21983), Republic of Korea.

Department of Natural and Basic Sciences, University of Turbat, Kech, Turbat, Balochistan (92600), Pakistan.

Publication information

Heliyon. 2022 Aug 4;8(8):e10011. doi: 10.1016/j.heliyon.2022.e10011. eCollection 2022 Aug.

Abstract

The dielectric constant (DC, ε) is a fundamental parameter in materials science that measures the polarizability of a system. In industrial processes its value is an essential indicator: it characterizes the dielectric properties of a material and informs separation, chemical equilibrium, chemical reactivity analysis, and solubility modeling. The available ε-prediction models are fairly primitive and frequently fail badly, especially when dealing with strongly polar compounds. We have therefore developed a novel data-driven system to improve the efficiency and broaden the applicability of ε prediction in materials science. The scheme uses correlation distance and a genetic algorithm to discriminate among feature combinations and avoid overfitting. The prediction output of each single ML model is used as an encoding for estimating the target value, which simulates the layer-by-layer extraction of deep learning and enables an instant search for the optimal combination of features. Our model achieved a correlation of 0.956 with the target, compared with 0.877 for the best previously available traditional ML result. The improvement is most pronounced for material systems with ε > 50. For interpretability, we derived a conceptual computational equation from a minimum spanning tree. Our data-driven system compares favorably with other methods because it applies not only to the prediction of dielectric constants but also to the prediction of the overall micro- and macro-properties of any multi-component complex.
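The abstract names two building blocks, correlation distance between features and a minimum spanning tree, without giving details. The following is a minimal sketch of how those two pieces can fit together, not the authors' implementation. It assumes correlation distance is defined as 1 minus the Pearson correlation and builds the tree with Prim's algorithm; the feature names and values are hypothetical and chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_distance(x, y):
    """Assumed definition: 1 - r, so 0 = perfectly correlated, 2 = anti-correlated."""
    return 1.0 - pearson(x, y)

def minimum_spanning_tree(columns):
    """Prim's algorithm on the complete graph whose nodes are features
    and whose edge weights are correlation distances."""
    names = list(columns)
    in_tree = [names[0]]
    edges = []
    while len(in_tree) < len(names):
        best = None
        for u in in_tree:
            for v in names:
                if v in in_tree:
                    continue
                d = correlation_distance(columns[u], columns[v])
                if best is None or d < best[2]:
                    best = (u, v, d)
        edges.append(best)
        in_tree.append(best[1])
    return edges

# Hypothetical descriptor columns (one value per compound).
features = {
    "dipole": [1.0, 2.0, 3.0, 4.0, 5.0],
    "polarizability": [1.1, 2.1, 2.9, 4.2, 5.0],
    "volume": [5.0, 3.0, 4.0, 1.0, 2.0],
}

edges = minimum_spanning_tree(features)
for u, v, d in edges:
    print(f"{u} -- {v}: distance {d:.3f}")
```

In this toy data the tree links the two nearly redundant descriptors first, which is the kind of structure a feature-selection step could use to avoid picking strongly correlated features together.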

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2569/9396556/0efb3735a41b/gr1.jpg
