Ascencio-Medina Estefania, He Shan, Daghighi Amirreza, Iduoku Kweeni, Casanola-Martin Gerardo M, Arrasate Sonia, González-Díaz Humberto, Rasulev Bakhtiyor
Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, USA.
IKERDATA S.L., ZITEK, University of the Basque Country (UPV/EHU), Rectorate Building, 48940 Bilbao, Biscay, Spain.
Polymers (Basel). 2024 Sep 26;16(19):2731. doi: 10.3390/polym16192731.
This work investigates dielectric permittivity, which is governed by electronic, ionic, and dipolar polarization mechanisms that together determine a material's capacity to store electrical energy. An extended dataset of 86 polymers was analyzed, and two quantitative structure-property relationship (QSPR) models were developed to predict dielectric permittivity. From an initial set of 1273 descriptors, the most relevant ones were selected using a genetic algorithm, and machine learning models were built using the Gradient Boosting Regressor (GBR). In contrast to Multiple Linear Regression (MLR)- and Partial Least Squares (PLS)-based models, gradient boosting models handle nonlinear relationships and multicollinearity well, iteratively optimizing decision trees to improve accuracy without overfitting. The developed GBR models showed high coefficients of determination (R²) of 0.938 and 0.822 for the training and test sets, respectively. The Accumulated Local Effect (ALE) technique was applied to assess how the selected descriptors (eight for the GB_A model and six for the GB_B model) affect the target property. ALE analysis revealed that descriptors such as TDB09m had a strong positive effect on permittivity, while MLOGP2 showed a negative effect. These results highlight the effectiveness of the GBR approach in predicting the dielectric properties of polymers, offering improved accuracy and interpretability.
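The workflow described in the abstract (fit a gradient boosting regressor on selected descriptors, report R² on training and test sets, then inspect descriptor effects with ALE) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the six descriptors are hypothetical stand-ins, the hyperparameters and the 80/20 split are assumptions, the genetic-algorithm selection step is omitted, and `ale_1d` is a simplified first-order ALE written for illustration using scikit-learn's `GradientBoostingRegressor`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a descriptor matrix: 86 samples, 6 hypothetical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(86, 6))
# Assumed target: increases with descriptor 0, decreases with descriptor 1.
y = 3.0 + 1.5 * np.tanh(X[:, 0]) - 0.8 * X[:, 1] + 0.1 * rng.normal(size=86)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hyperparameters are illustrative assumptions, not values from the paper.
model = GradientBoostingRegressor(
    n_estimators=200, max_depth=2, learning_rate=0.05, random_state=0
)
model.fit(X_train, y_train)

r2_train = r2_score(y_train, model.predict(X_train))
r2_test = r2_score(y_test, model.predict(X_test))
print(f"R2 train = {r2_train:.3f}, R2 test = {r2_test:.3f}")


def ale_1d(model, X, feature, n_bins=8):
    """Simplified first-order ALE of one feature, using quantile bins."""
    edges = np.unique(np.quantile(X[:, feature], np.linspace(0, 1, n_bins + 1)))
    n_intervals = len(edges) - 1
    # Bin index k in 1..n_intervals means edges[k-1] < x <= edges[k]
    # (the minimum value is clipped into the first bin).
    idx = np.clip(np.searchsorted(edges, X[:, feature], side="left"), 1, n_intervals)
    local = np.zeros(n_intervals)
    counts = np.zeros(n_intervals)
    for k in range(1, n_intervals + 1):
        mask = idx == k
        if not mask.any():
            continue
        lo, hi = X[mask].copy(), X[mask].copy()
        lo[:, feature] = edges[k - 1]
        hi[:, feature] = edges[k]
        # Average prediction change across the bin, other features held as observed.
        local[k - 1] = np.mean(model.predict(hi) - model.predict(lo))
        counts[k - 1] = mask.sum()
    ale = np.cumsum(local)
    ale -= np.sum(ale * counts) / counts.sum()  # center using bin counts as weights
    return edges[1:], ale


grid0, ale0 = ale_1d(model, X_train, feature=0)
grid1, ale1 = ale_1d(model, X_train, feature=1)
print("ALE, descriptor 0:", np.round(ale0, 2))
print("ALE, descriptor 1:", np.round(ale1, 2))
```

An ALE curve that rises across the descriptor's range indicates a positive effect on the predicted property, and a falling curve a negative one, which is the kind of reading the abstract reports for TDB09m and MLOGP2.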