Organic Compound Synthetic Accessibility Prediction Based on the Graph Attention Mechanism.

Authors

Yu Jiahui, Wang Jike, Zhao Hong, Gao Junbo, Kang Yu, Cao Dongsheng, Wang Zhe, Hou Tingjun

Affiliations

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China.

School of Computer Science, Wuhan University, Wuhan 430072, Hubei, P. R. China.

Publication information

J Chem Inf Model. 2022 Jun 27;62(12):2973-2986. doi: 10.1021/acs.jcim.2c00038. Epub 2022 Jun 8.

DOI:10.1021/acs.jcim.2c00038
PMID:35675668
Abstract

Accurate estimation of the synthetic accessibility of small molecules is needed in many phases of drug discovery. Several expert-crafted scoring methods and descriptor-based quantitative structure-activity relationship (QSAR) models have been developed for synthetic accessibility assessment, but their practical use in drug discovery remains limited because of relatively low prediction accuracy and poor model interpretability. In this study, we propose a data-driven, interpretable prediction framework called GASA (Graph Attention-based assessment of Synthetic Accessibility), which evaluates the synthetic accessibility of small molecules by classifying compounds as easy-to-synthesize (ES) or hard-to-synthesize (HS). GASA is a graph neural network (GNN) architecture that derives its own features, applying an attention mechanism to automatically capture the structural features most relevant to synthetic accessibility. Sampling around the hypothetical classification boundary was used to improve the ability of GASA to distinguish structurally similar molecules. GASA was extensively evaluated and compared with two descriptor-based machine learning methods (random forest, RF; eXtreme gradient boosting, XGBoost) and four existing scores (SYBA: SYnthetic Bayesian Accessibility; SCScore: Synthetic Complexity score; RAscore: Retrosynthetic Accessibility score; SAscore: Synthetic Accessibility score). Our analysis demonstrates that GASA achieved remarkable performance in distinguishing similar molecules compared with the other methods and had a broader applicability domain. In addition, we show how GASA learns the important features that affect molecular synthetic accessibility by assigning attention weights to different atoms. An online prediction service for GASA was offered at http://cadd.zju.edu.cn/gasa/.
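To make the idea of "assigning attention weights to different atoms" more concrete, the following is a minimal sketch of one generic single-head graph-attention update on a toy molecular graph. It is not the GASA architecture, which the abstract does not specify in detail. The function name `attention_layer`, the two-element atom features, and the fixed values of `W` and `a` are all illustrative assumptions; in a trained model these parameters would be learned from labelled ES/HS data.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU nonlinearity commonly used on raw attention scores."""
    return x if x > 0 else slope * x


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_layer(features, neighbors, W, a):
    """One generic single-head graph-attention update (illustrative only).

    features:  per-atom feature vectors
    neighbors: neighbors[i] lists the atoms bonded to atom i, plus i itself
    W:         projection matrix (out_dim rows, in_dim columns)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i][j] is the
    weight atom i places on neighbor j.
    """
    # Project every atom's features with the shared matrix W.
    z = [[sum(w * x for w, x in zip(row, h)) for row in W] for h in features]
    new_features, attention = [], []
    for i, nbrs in enumerate(neighbors):
        # Score each (atom, neighbor) pair from the concatenated projections.
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        alphas = softmax(scores)  # normalise over the neighborhood
        attention.append(dict(zip(nbrs, alphas)))
        # New representation: attention-weighted sum of neighbor projections.
        new_features.append(
            [sum(al * z[j][k] for al, j in zip(alphas, nbrs)) for k in range(len(W))]
        )
    return new_features, attention


# Toy graph: heavy atoms of ethanol, C(0)-C(1)-O(2).
# Hypothetical features: one-hot element type [is_carbon, is_oxygen].
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = [[0, 1], [0, 1, 2], [1, 2]]  # self-loops included

# Arbitrary, untrained parameters chosen only for illustration.
W = [[1.0, 0.5], [-0.5, 1.0]]
a = [0.1, 0.2, 0.3, 0.4]

new_features, attention = attention_layer(features, neighbors, W, a)
for atom, weights in enumerate(attention):
    print(atom, {j: round(w, 3) for j, w in weights.items()})
```

Each atom's weights sum to 1 over its neighborhood, so they can be read as relative importances. With learned parameters, inspecting such weights is one plausible way to see which atoms a model treats as most relevant to a prediction.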
