MolEM: a unified generative framework for molecular graphs and sequential orders.

Authors

Zhang Hanwen, Xiong Deng, Liu Xianggen, Lv Jiancheng

Affiliations

College of Computer Science, Sichuan University, No.24 South Section 1, Yihuan Road, Chengdu 610065, China.

Engineering Research Center of Machine Learning and Industry Intelligence, Ministry of Education, No.24 South Section 1, Yihuan Road, Chengdu 610065, China.

Publication information

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf094.

DOI: 10.1093/bib/bbaf094
PMID: 40163755
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11957264/

Abstract

Structure-based drug design aims to generate molecules that fill the cavity of the protein pocket with a high binding affinity. Many contemporary studies employ sequential generative models. Their standard training method is to sequentialize molecular graphs into ordered sequences and then maximize the likelihood of the resulting sequences. However, the exact likelihood is computationally intractable because it involves a sum over all possible sequential orders: molecular graphs lack an inherent order, and the number of orders is factorial in the graph size. To avoid the intractable full space of factorially many orders, existing works pre-define a fixed node ordering scheme, such as depth-first search, to sequentialize the 3D molecular graphs. In these cases, the training objectives are loose lower bounds of the exact likelihoods, which are suboptimal for generation. To address these challenges, we propose a unified generative framework named MolEM that learns the 3D molecular graphs and their corresponding sequential orders jointly. We derive a tight lower bound of the likelihood and maximize it via a variational expectation-maximization algorithm, opening a new line of research in learning-based ordering schemes for 3D molecular graph generation. In addition, we are the first to incorporate the molecular docking method QuickVina 2 to manipulate the binding poses, leading to accurate and flexible ligand conformations. Experimental results demonstrate that MolEM significantly outperforms baseline models in generating molecules with high binding affinities and realistic structures. Our approach efficiently approximates the true marginal graph likelihood and identifies reasonable orderings for 3D molecular graphs that align well with relevant chemical priors.
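
The gap between a fixed-order objective and the exact likelihood can be illustrated with a toy calculation. The sketch below is not the MolEM implementation. It assumes a three-atom "graph" and a made-up joint distribution p(G, π) over its 3! = 6 node orders, and all names (`joint`, `elbo`, `posterior`) are hypothetical. It compares the exact log marginal log Σ_π p(G, π) with the bound E_q[log p(G, π) − log q(π)] under two choices of q: a point mass on one fixed order, and the exact posterior over orders.

```python
import itertools
import math

def log_marginal(joint):
    """Exact log p(G): log of the sum of p(G, pi) over every order pi."""
    return math.log(sum(joint.values()))

def elbo(joint, q):
    """Lower bound E_q[log p(G, pi) - log q(pi)]; orders with q(pi) = 0 are skipped."""
    return sum(w * (math.log(joint[pi]) - math.log(w))
               for pi, w in q.items() if w > 0)

def posterior(joint):
    """Exact posterior over orders, p(pi | G) = p(G, pi) / p(G)."""
    z = sum(joint.values())
    return {pi: p / z for pi, p in joint.items()}

# Toy graph with three nodes; the number of orders grows as n!.
nodes = ("C", "N", "O")
orders = list(itertools.permutations(nodes))

# Hypothetical joint probabilities p(G, pi), chosen only for illustration.
joint = {pi: 0.01 * (i + 1) for i, pi in enumerate(orders)}

exact = log_marginal(joint)
# A fixed ordering scheme corresponds to q putting all mass on one order.
loose = elbo(joint, {orders[0]: 1.0})
# With q equal to the exact posterior, the bound matches the marginal.
tight = elbo(joint, posterior(joint))

print(f"orders: {len(orders)}")
print(f"exact log-likelihood: {exact:.4f}")
print(f"fixed-order bound:    {loose:.4f}")
print(f"posterior-q bound:    {tight:.4f}")
```

In this toy setting the fixed-order bound falls below the exact value, while the bound closes when q matches the posterior. Enumerating all orders is only feasible here because n is tiny; for realistic molecules the posterior has to be approximated, which is the role the abstract assigns to the variational expectation-maximization procedure.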

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/980a/11957264/8747ca4f791d/bbaf094f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/980a/11957264/6f2eec4e4a3d/bbaf094f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/980a/11957264/f744f1ef0f63/bbaf094f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/980a/11957264/e0a818ef36fe/bbaf094f4.jpg

Similar articles

1
MolEM: a unified generative framework for molecular graphs and sequential orders.
Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf094.
2
Efficient learning of non-autoregressive graph variational autoencoders for molecular graph generation.
J Cheminform. 2019 Nov 21;11(1):70. doi: 10.1186/s13321-019-0396-x.
3
Multi-objective de novo drug design with conditional graph generative model.
J Cheminform. 2018 Jul 24;10(1):33. doi: 10.1186/s13321-018-0287-6.
4
FraHMT: A Fragment-Oriented Heterogeneous Graph Molecular Generation Model for Target Proteins.
J Chem Inf Model. 2024 May 13;64(9):3718-3732. doi: 10.1021/acs.jcim.4c00252. Epub 2024 Apr 22.
5
An equivariant generative framework for molecular graph-structure Co-design.
Chem Sci. 2023 Jul 19;14(31):8380-8392. doi: 10.1039/d3sc02538a. eCollection 2023 Aug 9.
6
How Good are Current Pocket-Based 3D Generative Models?: The Benchmark Set and Evaluation of Protein Pocket-Based 3D Molecular Generative Models.
J Chem Inf Model. 2024 Dec 23;64(24):9260-9275. doi: 10.1021/acs.jcim.4c01598. Epub 2024 Dec 4.
7
High-Temperature Polymer Dielectrics Designed Using an Invertible Molecular Graph Generative Model.
J Chem Inf Model. 2023 Dec 25;63(24):7669-7675. doi: 10.1021/acs.jcim.3c01572. Epub 2023 Dec 7.
8
Energy-based graph convolutional networks for scoring protein docking models.
Proteins. 2020 Aug;88(8):1091-1099. doi: 10.1002/prot.25888. Epub 2020 Mar 16.
9
Protein-Ligand Blind Docking Using QuickVina-W With Inter-Process Spatio-Temporal Integration.
Sci Rep. 2017 Nov 13;7(1):15451. doi: 10.1038/s41598-017-15571-7.
10
Drug-target affinity prediction with extended graph learning-convolutional networks.
BMC Bioinformatics. 2024 Feb 16;25(1):75. doi: 10.1186/s12859-024-05698-6.
