Benchmarking Molecular Feature Attribution Methods with Activity Cliffs.

Affiliations

Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, 8093 Zurich, Switzerland.

Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397 Biberach an der Riss, Germany.

Publication information

J Chem Inf Model. 2022 Jan 24;62(2):274-283. doi: 10.1021/acs.jcim.1c01163. Epub 2022 Jan 12.

DOI:10.1021/acs.jcim.1c01163

PMID:35019265

Abstract

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the provided inputs used by an underlying supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can be useful for making a decision of which compounds to synthesize or prioritize. The consistency of the highlighted moieties alongside expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. We here present an approach that is based on maximum common substructure algorithms applied to experimentally-determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, graph-neural-network alternatives. The provided benchmark data are fully open sourced, which we hope will facilitate the testing of newly developed molecular feature attribution techniques.
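As a rough illustration of the activity-cliff idea behind the benchmark, the sketch below finds pairs of compounds that are structurally similar yet differ strongly in potency, and reports the part in which they differ. It is not the authors' implementation: the compound names, the fragment sets, and both thresholds are hypothetical, and Tanimoto similarity over fragment labels stands in for the maximum common substructure computation that the abstract mentions.

```python
from itertools import combinations

# Hypothetical toy data (assumption): each compound is described by a set of
# fragment labels and a potency value (pIC50). A real pipeline would derive
# the shared and differing parts from a maximum common substructure search.
COMPOUNDS = {
    "cpd_a": ({"phenyl", "amide", "piperidine", "fluoro"}, 8.2),
    "cpd_b": ({"phenyl", "amide", "piperidine", "chloro"}, 6.1),
    "cpd_c": ({"phenyl", "amide", "piperidine", "methyl"}, 7.9),
    "cpd_d": ({"indole", "sulfonamide"}, 5.0),
}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fragment sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_activity_cliffs(compounds, min_similarity=0.6, min_potency_gap=1.0):
    """Return pairs that are similar in structure but far apart in potency.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    cliffs = []
    for (name_i, (frag_i, act_i)), (name_j, (frag_j, act_j)) in combinations(
        sorted(compounds.items()), 2
    ):
        similarity = tanimoto(frag_i, frag_j)
        potency_gap = abs(act_i - act_j)
        if similarity >= min_similarity and potency_gap >= min_potency_gap:
            cliffs.append(
                {
                    "pair": (name_i, name_j),
                    "common": frag_i & frag_j,     # shared scaffold
                    "differing": frag_i ^ frag_j,  # part that changes
                    "potency_gap": potency_gap,
                }
            )
    return cliffs


if __name__ == "__main__":
    for cliff in find_activity_cliffs(COMPOUNDS):
        print(cliff["pair"], sorted(cliff["differing"]))
```

Under these assumptions, a plausible sanity check for a feature attribution method is that it places its strongest colors on the differing part of each cliff pair, since the shared scaffold alone cannot account for the potency change.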

Similar articles

Benchmarking Molecular Feature Attribution Methods with Activity Cliffs.

J Chem Inf Model. 2022 Jan 24;62(2):274-283. doi: 10.1021/acs.jcim.1c01163. Epub 2022 Jan 12.

Coloring Molecules with Explainable Artificial Intelligence for Preclinical Relevance Assessment.

J Chem Inf Model. 2021 Mar 22;61(3):1083-1094. doi: 10.1021/acs.jcim.0c01344. Epub 2021 Feb 25.

Artificial intelligence to deep learning: machine intelligence approach for drug discovery.

Mol Divers. 2021 Aug;25(3):1315-1360. doi: 10.1007/s11030-021-10217-3. Epub 2021 Apr 12.

Exploration of chemical space with partial labeled noisy student self-training and self-supervised graph embedding.

BMC Bioinformatics. 2022 May 2;23(Suppl 3):158. doi: 10.1186/s12859-022-04681-3.

Explaining compound activity predictions with a substructure-aware loss for graph neural networks.

J Cheminform. 2023 Jul 25;15(1):67. doi: 10.1186/s13321-023-00733-9.

Explaining protein-protein interactions with knowledge graph-based semantic similarity.

Comput Biol Med. 2024 Mar;170:108076. doi: 10.1016/j.compbiomed.2024.108076. Epub 2024 Feb 1.

Identification of vital chemical information via visualization of graph neural networks.

Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac577.

Comparing Explanations of Molecular Machine Learning Models Generated with Different Methods for the Calculation of Shapley Values.

Mol Inform. 2025 Mar;44(3):e202500067. doi: 10.1002/minf.202500067.

Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks.

Comput Biol Med. 2023 Apr;156:106668. doi: 10.1016/j.compbiomed.2023.106668. Epub 2023 Feb 18.

Data Integration Using Advances in Machine Learning in Drug Discovery and Molecular Biology.

Methods Mol Biol. 2021;2190:167-184. doi: 10.1007/978-1-0716-0826-5_7.

Cited by

ACtriplet: An improved deep learning model for activity cliffs prediction by integrating triplet loss and pre-training.

J Pharm Anal. 2025 Aug;15(8):101317. doi: 10.1016/j.jpha.2025.101317. Epub 2025 Apr 21.

ACES-GNN: can graph neural network learn to explain activity cliffs?

Digit Discov. 2025 Jun 30. doi: 10.1039/d5dd00012b.

Activity Cliff-Informed Contrastive Learning for Molecular Property Prediction.

Res Sq. 2024 Dec 4:rs.3.rs-2988283. doi: 10.21203/rs.3.rs-2988283/v2.

Integrating Explainability into Graph Neural Network Models for the Prediction of X-ray Absorption Spectra.

J Am Chem Soc. 2023 Oct 18;145(41):22584-22598. doi: 10.1021/jacs.3c07513. Epub 2023 Oct 9.

Explaining compound activity predictions with a substructure-aware loss for graph neural networks.

J Cheminform. 2023 Jul 25;15(1):67. doi: 10.1186/s13321-023-00733-9.

Large-scale prediction of activity cliffs using machine and deep learning methods of increasing complexity.

J Cheminform. 2023 Jan 7;15(1):4. doi: 10.1186/s13321-022-00676-7.

Quantitative evaluation of explainable graph neural networks for molecular property prediction.

Patterns (N Y). 2022 Nov 10;3(12):100628. doi: 10.1016/j.patter.2022.100628. eCollection 2022 Dec 9.

Exposing the Limitations of Molecular Machine Learning with Activity Cliffs.

J Chem Inf Model. 2022 Dec 12;62(23):5938-5951. doi: 10.1021/acs.jcim.2c01073. Epub 2022 Dec 1.

EdgeSHAPer: Bond-centric Shapley value-based explanation method for graph neural networks.

iScience. 2022 Aug 30;25(10):105043. doi: 10.1016/j.isci.2022.105043. eCollection 2022 Oct 21.
