
An effective self-supervised framework for learning expressive molecular global representations to drug discovery.

Affiliations

Department of Biomedical Engineering at Tsinghua University, China.

Ping An Healthcare Technology, Chaoyang, 100027 Beijing, China.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab109.

DOI: 10.1093/bib/bbab109
PMID: 33940598
Abstract

Producing expressive molecular representations is a fundamental challenge in artificial intelligence-driven drug discovery. Graph neural networks (GNNs) have emerged as a powerful technique for modeling molecular data. However, previous supervised approaches usually suffer from scarce labeled data and poor generalization. Here, we propose MPG, a novel graph-based deep learning framework for molecular pre-training that learns molecular representations from large-scale unlabeled molecules. In MPG, we propose MolGNet, a powerful GNN for modeling molecular graphs, and design an effective self-supervised strategy for pre-training the model at both the node and graph levels. After pre-training on 11 million unlabeled molecules, we show that MolGNet captures valuable chemical insights and produces interpretable representations. The pre-trained MolGNet can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of drug discovery tasks, including molecular property prediction, drug-drug interaction prediction and drug-target interaction prediction, on 14 benchmark datasets. The pre-trained MolGNet in MPG has the potential to become an advanced molecular encoder in the drug discovery pipeline.
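The "pre-trained encoder plus one additional output layer" pattern described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: `encode` is a hypothetical stand-in (simple mean-neighbour message passing with a mean readout), not MolGNet; the one-hot atom features, the four toy molecules and the "contains oxygen" label are invented; and the encoder is kept frozen while only the output layer is trained, which is a simplification of the fine-tuning the abstract refers to.

```python
import math

# Hypothetical stand-in for a pre-trained encoder. This is NOT MolGNet: it only
# mimics the interface "molecular graph in, fixed-size graph embedding out".
def encode(atom_feats, bonds, rounds=2):
    n = len(atom_feats)
    neighbours = [[] for _ in range(n)]
    for u, v in bonds:
        neighbours[u].append(v)
        neighbours[v].append(u)
    h = [list(f) for f in atom_feats]
    for _ in range(rounds):
        # Node-level step: each atom averages its own state with its neighbours'.
        h = [
            [sum(col) / (1 + len(neighbours[i]))
             for col in zip(h[i], *(h[j] for j in neighbours[i]))]
            for i in range(n)
        ]
    # Graph-level readout: mean over all atom states.
    return [sum(col) / n for col in zip(*h)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # The single added output layer: a linear map followed by a sigmoid.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fit_output_layer(embeddings, labels, lr=0.5, steps=5000):
    """Train only the output layer (logistic regression) with full-batch
    gradient descent; the encoder stays frozen in this sketch."""
    dim, m = len(embeddings[0]), len(embeddings)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        errors = [predict(w, b, x) - y for x, y in zip(embeddings, labels)]
        w = [wi - lr * sum(e * x[k] for e, x in zip(errors, embeddings)) / m
             for k, wi in enumerate(w)]
        b -= lr * sum(errors) / m
    return w, b

# Toy data (assumed): one-hot atom features for carbon and oxygen.
C, O = [1.0, 0.0], [0.0, 1.0]
molecules = {
    "ethane":   ([C, C],    [(0, 1)]),
    "propane":  ([C, C, C], [(0, 1), (1, 2)]),
    "methanol": ([C, O],    [(0, 1)]),
    "ethanol":  ([C, C, O], [(0, 1), (1, 2)]),
}
labels = {"ethane": 0, "propane": 0, "methanol": 1, "ethanol": 1}  # toy task: contains oxygen?

names = list(molecules)
embeddings = [encode(*molecules[name]) for name in names]
w, b = fit_output_layer(embeddings, [labels[name] for name in names])
for name, emb in zip(names, embeddings):
    print(name, round(predict(w, b, emb), 3))
```

In a real setting the encoder weights would come from self-supervised pre-training and would typically be updated together with the output layer; freezing them here keeps the example short and makes the role of the single added layer easy to see.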
