Interpretable Machine Learning Prediction of Polyimide Dielectric Constants: A Feature-Engineered Approach with Experimental Validation.

Authors

He Xiaojie, Wan Jiachen, Zhang Songyang, Zhang Chenggang, Xiao Peng, Zheng Feng, Lu Qinghua

Affiliations

School of Chemical Science and Engineering, Tongji University, Siping Road No. 1239, Shanghai 200092, China.

Institute of Micro/Nano Materials and Devices, Ningbo University of Technology, Fenghua Road No. 201, Ningbo 315211, China.

Publication information

Polymers (Basel). 2025 Jun 11;17(12):1622. doi: 10.3390/polym17121622.

Abstract

Low-dielectric polyimides (PIs) have emerged as essential materials for next-generation microelectronics and communication technologies, yet traditional experimental and theoretical methods for obtaining dielectric constant data face challenges in cost, accuracy, and scalability. This study presents a machine learning (ML) framework that combines polymer domain knowledge with data-driven modeling to predict PI dielectric constants at 1 kHz. A dataset of 439 PIs was constructed, and 208 molecular descriptors were derived from SMILES-encoded structures. Through feature engineering (variance filtering, correlation analysis, and recursive feature elimination), 10 key descriptors were identified that capture electronic and polar interactions, surface area, and structural complexity. Five ML algorithms were evaluated, with Gaussian Process Regression (GPR) achieving the highest predictive accuracy (test set: R² = 0.90, RMSE = 0.10). Shapley additive explanations (SHAP) analysis quantifies the contribution of each molecular descriptor to the predicted dielectric constant and shows whether that descriptor raises or lowers the prediction. Three novel PIs were synthesized for experimental validation, showing strong agreement between predicted and measured dielectric constants (mean percentage deviation: 2.24%). The model gives robust predictions for other structurally similar polymers but shows a 40% reduction in accuracy (R² = 0.60) for cross-frequency predictions at 10 GHz, underscoring the need for multi-frequency training datasets to improve model generalizability. This work advances the research paradigm for polymer dielectric materials and provides a pathway for the rational, ML-guided design of materials.
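The first two feature-engineering steps named in the abstract (variance filtering and correlation analysis) and the reported error metrics can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption for illustration: the function names, the threshold values, the toy descriptor table, and the exact definition of "mean percentage deviation" (taken as the mean of |predicted - measured| / measured, times 100) are not given in the abstract. Recursive feature elimination, GPR, and SHAP are not reproduced.

```python
import math
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def filter_descriptors(columns, var_threshold=1e-3, corr_threshold=0.9):
    """Drop near-constant descriptors, then one of each highly correlated pair.

    `columns` maps a descriptor name to its values (one value per polymer).
    Both thresholds are illustrative assumptions, not values from the study.
    """
    # Step 1: variance filtering removes descriptors that barely change.
    kept = [n for n, vals in columns.items() if pvariance(vals) > var_threshold]
    # Step 2: correlation analysis keeps a descriptor only if it is not
    # strongly correlated with any descriptor that was already selected.
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[s])) < corr_threshold
               for s in selected):
            selected.append(name)
    return selected


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def mean_percentage_deviation(measured, predicted):
    """Mean of |predicted - measured| / measured, in percent (assumed definition)."""
    return 100.0 * mean(abs(p - m) / m for m, p in zip(measured, predicted))


# Hypothetical descriptor table for four polymers (made-up numbers).
toy = {
    "desc_a": [1.0, 2.0, 3.0, 4.0],
    "desc_b": [2.0, 4.0, 6.0, 8.1],  # almost a multiple of desc_a
    "desc_c": [5.0, 5.0, 5.0, 5.0],  # constant, so zero variance
    "desc_d": [4.0, 1.0, 3.0, 2.0],
}
print(filter_descriptors(toy))  # → ['desc_a', 'desc_d']
```

In a full workflow, the surviving descriptors would then be passed to a recursive feature elimination step and a regression model; those stages depend on library choices that the abstract does not specify.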


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c724/12197143/904d9e1a0c6e/polymers-17-01622-g001.jpg
