QMGBP-DL: a deep learning and machine learning approach for quantum molecular graph band-gap prediction.

Author information

Abbassi Outhman, Ziti Soumia

Affiliation

IPSS, Intelligent Processing and Security of Systems, Faculty of Science, Mohammed V University in Rabat, 1014 RP, Rabat, Morocco.

Publication information

Mol Divers. 2025 Apr 19. doi: 10.1007/s11030-025-11178-7.

Abstract

Predicting molecular and quantum material properties, especially the band gap, is crucial for accelerating discoveries in drug design and materials science. Although graph neural networks and probabilistic encoders are well established in molecular data analysis, their targeted integration and application to band-gap prediction remain an active research area. This paper introduces QMGBP-DL, a deep learning approach that combines a molecular graph encoder with machine learning models to improve the prediction accuracy of molecular and material band-gap energy. The encoder uses graph convolutional networks (GCNs) to derive latent representations of chemical structures from SMILES strings, optimized via a Kullback-Leibler divergence loss. These representations serve as inputs for training various machine learning models to predict properties. QMGBP-DL is evaluated on the QM9, PCQM4M, and OPV datasets, where it shows significant improvements, particularly with a random forest model for property prediction. A comparative analysis against the established approaches DenseGNN, MEGNet, and ALIGNN shows that QMGBP-DL achieves notably lower mean absolute error (MAE) values when predicting the highest occupied molecular orbital (HOMO) energy, the lowest unoccupied molecular orbital (LUMO) energy, and the band gap. These results indicate that combining GCN-derived latent representations with traditional machine learning models, especially random forest, is highly effective for accurate band-gap prediction, thereby facilitating material discovery and design.
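
The building blocks named in the abstract (a graph convolution over a molecular graph, a Kullback-Leibler divergence term on the latent representation, and MAE as the evaluation metric) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. Several things here are assumptions: the symmetric-normalized GCN propagation rule, a diagonal-Gaussian latent compared against a standard normal prior (the abstract only says the encoder is optimized via a KL divergence loss), mean pooling over atoms, the hand-written graph for the SMILES string `CCO` (a real pipeline would parse SMILES with a cheminformatics toolkit), and every weight value and function name.

```python
import math

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix (assumed GCN rule)."""
    n = len(adj)
    # Add self-loops so each atom keeps its own features during propagation.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj_norm, features, weights, activate=True):
    """One graph convolution: A_norm @ H @ W, followed by ReLU when activate is True."""
    out = matmul(matmul(adj_norm, features), weights)
    if activate:
        out = [[max(0.0, v) for v in row] for row in out]
    return out

def mean_pool(node_vectors):
    """Average per-atom vectors into one molecule-level vector (assumed readout)."""
    n = len(node_vectors)
    return [sum(col) / n for col in zip(*node_vectors)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), the usual variational-encoder term."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))

def mean_absolute_error(y_true, y_pred):
    """MAE, the metric the abstract reports for HOMO, LUMO, and band gap."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hand-built graph for SMILES "CCO" (heavy atoms only): C - C - O.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
# One-hot atom features: column 0 = carbon, column 1 = oxygen.
atom_features = [[1.0, 0.0],
                 [1.0, 0.0],
                 [0.0, 1.0]]

# Hypothetical, untrained weights chosen only to make the example concrete.
w_hidden = [[0.5, -0.2], [0.1, 0.3]]
w_mu = [[0.4, 0.0], [0.0, 0.4]]
w_logvar = [[-0.1, 0.0], [0.0, -0.1]]

adj_norm = normalized_adjacency(adjacency)
hidden = gcn_layer(adj_norm, atom_features, w_hidden)
latent_mu = mean_pool(gcn_layer(adj_norm, hidden, w_mu, activate=False))
latent_logvar = mean_pool(gcn_layer(adj_norm, hidden, w_logvar, activate=False))

print("latent mean:", latent_mu)
print("KL term:", kl_to_standard_normal(latent_mu, latent_logvar))
```

In a pipeline of the kind the abstract describes, the pooled latent vector for each molecule would then serve as the feature vector for a downstream regressor such as a random forest, and `mean_absolute_error` would compare its predictions against reference band-gap values. The regressor is omitted here to keep the sketch dependency-free.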
