Interpretable AI and Machine Learning Classification for Identifying High-Efficiency Donor-Acceptor Pairs in Organic Solar Cells.

Author information

Siddiqui Hamza, Usmani Tahsin

Affiliation

Organic PV Lab, Integral University, Lucknow 226026, India.

Publication information

ACS Omega. 2024 Jul 31;9(32):34445-34455. doi: 10.1021/acsomega.4c02157. eCollection 2024 Aug 13.

Abstract

To enhance the efficiency of organic solar cells, accurately predicting the efficiency of new pairs of donor and acceptor materials is crucial. Presently, most machine learning studies rely on regression models, which often struggle to establish clear rules for distinguishing between high- and low-performing donor-acceptor pairs. This study proposes a novel approach by integrating interpretable AI, specifically using Shapley values, with four supervised machine learning classification models, namely, support vector machines, decision trees, random forest, and gradient boosting. These models aim to identify high-efficiency donor-acceptor pairs based solely on chemical structures and to extract important features that establish general design principles for distinguishing between high- and low-efficiency pairs. For validation purposes, an unsupervised machine learning algorithm utilizing loading vectors obtained from the principal component analysis is employed to identify crucial features associated with high-efficiency donor-acceptor pairs. Interestingly, the features identified by the supervised machine learning approach were found to be a subset of those identified by the unsupervised method. Noteworthy features include the van der Waals surface area, partial equalization of orbital electronegativity, Moreau-Broto autocorrelation, and molecular substructures. Leveraging these features, a backward-working model can be developed, facilitating exploration across a wide array of materials used in organic solar cells. This innovative approach will help navigate the vast chemical compound space of donor and acceptor materials essential in creating high-efficiency organic solar cells.
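The Shapley-value attribution mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not describe their implementation, and the scoring function `toy_score`, the three feature names, and all numeric values below are hypothetical stand-ins chosen only to make the idea concrete. The sketch computes exact Shapley values for one sample by enumerating every feature coalition, which is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, enumerating all feature coalitions."""
    n = len(x)
    features = range(n)

    def value(coalition):
        # Features in the coalition take the sample's values;
        # all other features are held at the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(v):
    # Hypothetical "high-efficiency" score, NOT a model from the paper.
    surface_area, electronegativity, autocorrelation = v
    return (0.5 * surface_area
            + 0.3 * electronegativity
            + 0.2 * surface_area * autocorrelation)


# Hypothetical descriptor values for one donor-acceptor pair.
names = ["vdw_surface_area", "peoe_electronegativity", "autocorrelation"]
x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

phi = shapley_values(toy_score, x, baseline)
for name, contribution in zip(names, phi):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the score for the sample and the score for the baseline, which is the property that makes Shapley values usable for ranking features. For real descriptor sets with many features, exact enumeration is too expensive and an approximate or model-specific method would be needed.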

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9a56/11325493/65e4b71b3bd2/ao4c02157_0001.jpg
