Benchmarking Molecular Feature Attribution Methods with Activity Cliffs.

Author information

Department of Chemistry and Applied Biosciences, RETHINK, ETH Zurich, 8093 Zurich, Switzerland.

Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397 Biberach an der Riss, Germany.

Publication information

J Chem Inf Model. 2022 Jan 24;62(2):274-283. doi: 10.1021/acs.jcim.1c01163. Epub 2022 Jan 12.

Abstract

Feature attribution techniques are popular choices within the explainable artificial intelligence toolbox, as they can help elucidate which parts of the provided inputs used by an underlying supervised-learning method are considered relevant for a specific prediction. In the context of molecular design, these approaches typically involve the coloring of molecular graphs, whose presentation to medicinal chemists can be useful for making a decision of which compounds to synthesize or prioritize. The consistency of the highlighted moieties alongside expert background knowledge is expected to contribute to the understanding of machine-learning models in drug design. Quantitative evaluation of such coloring approaches, however, has so far been limited to substructure identification tasks. We here present an approach that is based on maximum common substructure algorithms applied to experimentally-determined activity cliffs. Using the proposed benchmark, we found that molecule coloring approaches in conjunction with classical machine-learning models tend to outperform more modern, graph-neural-network alternatives. The provided benchmark data are fully open sourced, which we hope will facilitate the testing of newly developed molecular feature attribution techniques.
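The abstract does not spell out how a coloring is scored against an activity cliff, so the following is only a minimal sketch under stated assumptions. It assumes that the maximum common substructure (MCS) of each cliff pair has already been computed by some external tool, that each molecule carries one attribution value per atom, and that a coloring counts as correct when the attribution summed over the atoms outside the MCS differs between the two molecules in the same direction as their measured activities. The names `CliffPair`, `uncommon_sum`, `direction_agrees`, and `benchmark_score`, as well as the toy numbers, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass
class CliffPair:
    """One activity-cliff pair with a precomputed MCS (hypothetical structure)."""
    attributions_a: Sequence[float]  # per-atom coloring of molecule A
    attributions_b: Sequence[float]  # per-atom coloring of molecule B
    mcs_atoms_a: Set[int]            # atom indices of A inside the MCS
    mcs_atoms_b: Set[int]            # atom indices of B inside the MCS
    activity_a: float                # measured activity of A (e.g. pIC50)
    activity_b: float                # measured activity of B


def uncommon_sum(attributions: Sequence[float], mcs_atoms: Set[int]) -> float:
    """Sum the attributions of atoms that lie outside the common substructure."""
    return sum(v for i, v in enumerate(attributions) if i not in mcs_atoms)


def direction_agrees(pair: CliffPair) -> bool:
    """Assumed criterion: the attribution difference on the uncommon atoms
    has the same sign as the experimental activity difference."""
    attr_diff = (uncommon_sum(pair.attributions_a, pair.mcs_atoms_a)
                 - uncommon_sum(pair.attributions_b, pair.mcs_atoms_b))
    activity_diff = pair.activity_a - pair.activity_b
    return attr_diff * activity_diff > 0


def benchmark_score(pairs: List[CliffPair]) -> float:
    """Fraction of cliff pairs whose coloring agrees with the activity change."""
    if not pairs:
        raise ValueError("at least one activity-cliff pair is required")
    return sum(direction_agrees(p) for p in pairs) / len(pairs)


# Toy data, invented purely for illustration.
pairs = [
    # The uncommon atom of A is colored positive and A is more active: agrees.
    CliffPair([0.1, 0.2, 0.8], [0.1, 0.2, -0.3], {0, 1}, {0, 1}, 8.0, 6.0),
    # The uncommon atom of A is colored negative although A is more active: disagrees.
    CliffPair([0.0, -0.5], [0.0, 0.4], {0}, {0}, 7.5, 5.5),
]
score = benchmark_score(pairs)
```

In practice the MCS atom indices would come from a cheminformatics toolkit and the attributions from the model being explained; both are passed in as plain data here so that the scoring step can be read on its own.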

