Combining Group-Contribution Concept and Graph Neural Networks Toward Interpretable Molecular Property Models.

Affiliations

Process and Systems Engineering Center (PROSYS), Department of Chemical and Biochemical Engineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.

Publication information

J Chem Inf Model. 2023 Feb 13;63(3):725-744. doi: 10.1021/acs.jcim.2c01091. Epub 2023 Jan 30.

DOI:10.1021/acs.jcim.2c01091

PMID:36716461

Abstract

Quantitative structure-property relationships (QSPRs) are important tools to facilitate and accelerate the discovery of compounds with desired properties. While many QSPRs have been developed, they are associated with various shortcomings such as a lack of generalizability and modest accuracy. Although various machine-learning and deep-learning techniques have been integrated into such models, another shortcoming has emerged: a lack of transparency and interpretability. In this work, two interpretable graph neural network (GNN) models, attentive group-contribution (AGC) and group-contribution-based graph attention (GroupGAT), are developed by integrating fundamentals using the concept of group contributions (GC). Interpretability is obtained by using the attention mechanism to highlight the substructures with the highest attention weights in the latent representation of the molecules. The proposed models outperformed classical group-contribution models, as well as various other GNN models, in describing the aqueous solubility, melting point, and enthalpies of formation, combustion, and fusion of organic compounds. The insights provided are consistent with those obtained from the semiempirical GC models, confirming that the proposed framework can highlight the substructures of a molecule that are important for a specific property.
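
The two ideas the abstract combines can be illustrated with a toy example. The sketch below is not the AGC or GroupGAT architecture. It only shows, under stated assumptions, (a) a classical first-order group-contribution estimate as a sum of per-group terms and (b) a softmax attention readout over group embeddings whose weights can be ranked to point at the most influential substructure. All names (`GROUP_CONTRIB`, `gc_property`, `attention_readout`), the contribution values, and the embeddings are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical per-group contributions (illustrative numbers, not fitted values).
GROUP_CONTRIB = {"CH3": -0.5, "CH2": -0.3, "OH": 1.2}


def gc_property(group_counts, contrib=GROUP_CONTRIB, bias=0.0):
    """First-order additive group-contribution estimate: bias + sum_i n_i * c_i."""
    return bias + sum(n * contrib[g] for g, n in group_counts.items())


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_readout(group_embeddings, query):
    """Score each group embedding against a query vector (dot product),
    normalise the scores with softmax, and return the per-group attention
    weights together with the attention-weighted sum of the embeddings."""
    names = list(group_embeddings)
    scores = [
        sum(q * x for q, x in zip(query, group_embeddings[name]))
        for name in names
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * group_embeddings[name][d] for w, name in zip(weights, names))
        for d in range(len(query))
    ]
    return dict(zip(names, weights)), pooled


# Ethanol decomposed into first-order groups (assumed fragmentation).
ethanol = {"CH3": 1, "CH2": 1, "OH": 1}
estimate = gc_property(ethanol)

# Hypothetical 2-dimensional group embeddings and query vector.
embeddings = {"CH3": [0.1, 0.0], "CH2": [0.2, 0.1], "OH": [0.9, 0.8]}
weights, pooled = attention_readout(embeddings, query=[1.0, 1.0])

# The group with the largest attention weight is the one to highlight.
top_group = max(weights, key=weights.get)
print(estimate, top_group)
```

In a trained model the embeddings and the query would be learned, and the ranking of attention weights is what would be compared against the contributions of a semiempirical GC model.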


Similar articles

Combining Group-Contribution Concept and Graph Neural Networks Toward Interpretable Molecular Property Models.

J Chem Inf Model. 2023 Feb 13;63(3):725-744. doi: 10.1021/acs.jcim.2c01091. Epub 2023 Jan 30.

Graph Neural Tree: A novel and interpretable deep learning-based framework for accurate molecular property predictions.

Anal Chim Acta. 2023 Mar 1;1244:340558. doi: 10.1016/j.aca.2022.340558. Epub 2022 Nov 3.

MD-GNN: A mechanism-data-driven graph neural network for molecular properties prediction and new material discovery.

J Mol Graph Model. 2023 Sep;123:108506. doi: 10.1016/j.jmgm.2023.108506. Epub 2023 May 9.

Organic Compound Synthetic Accessibility Prediction Based on the Graph Attention Mechanism.

J Chem Inf Model. 2022 Jun 27;62(12):2973-2986. doi: 10.1021/acs.jcim.2c00038. Epub 2022 Jun 8.

FP-GNN: a versatile deep learning architecture for enhanced molecular property prediction.

Brief Bioinform. 2022 Nov 19;23(6). doi: 10.1093/bib/bbac408.

Formula Graph Self-Attention Network for Representation-Domain Independent Materials Discovery.

Adv Sci (Weinh). 2022 Jun;9(18):e2200164. doi: 10.1002/advs.202200164. Epub 2022 Apr 27.

GeoDILI: A Robust and Interpretable Model for Drug-Induced Liver Injury Prediction Using Graph Neural Network-Based Molecular Geometric Representation.

Chem Res Toxicol. 2023 Nov 20;36(11):1717-1730. doi: 10.1021/acs.chemrestox.3c00199. Epub 2023 Oct 15.

Benchmarking Accuracy and Generalizability of Four Graph Neural Networks Using Large In Vitro ADME Datasets from Different Chemical Spaces.

Mol Inform. 2022 Aug;41(8):e2100321. doi: 10.1002/minf.202100321. Epub 2022 Feb 23.

Interpretable-ADMET: a web service for ADMET prediction and optimization based on deep neural representation.

Bioinformatics. 2022 May 13;38(10):2863-2871. doi: 10.1093/bioinformatics/btac192.

Explainable Solvation Free Energy Prediction Combining Graph Neural Networks with Chemical Intuition.

J Chem Inf Model. 2022 Nov 28;62(22):5457-5470. doi: 10.1021/acs.jcim.2c01013. Epub 2022 Nov 1.

Cited by

MulAFNet: Integrating Multiple Molecular Representations for Enhanced Property Prediction.

ACS Omega. 2025 Mar 19;10(12):12043-12053. doi: 10.1021/acsomega.4c09884. eCollection 2025 Apr 1.

Homogeneous catalyst graph neural network: A human-interpretable graph neural network tool for ligand optimization in asymmetric catalysis.

iScience. 2025 Jan 23;28(3):111881. doi: 10.1016/j.isci.2025.111881. eCollection 2025 Mar 21.

Predicting and Explaining Yields with Machine Learning for Carboxylated Azoles and Beyond.

J Chem Inf Model. 2025 Feb 24;65(4):1862-1872. doi: 10.1021/acs.jcim.4c02336. Epub 2025 Feb 7.

HANNA: hard-constraint neural network for consistent activity coefficient prediction.

Chem Sci. 2024 Oct 31;15(47):19777-19786. doi: 10.1039/d4sc05115g. eCollection 2024 Dec 4.

FGTN: Fragment-based graph transformer network for predicting reproductive toxicity.

Arch Toxicol. 2024 Dec;98(12):4077-4092. doi: 10.1007/s00204-024-03866-4. Epub 2024 Sep 18.
