Cross-modality and self-supervised protein embedding for compound-protein affinity and contact prediction.

Author affiliations

Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.

Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA.

Publication information

Bioinformatics. 2022 Sep 16;38(Suppl_2):ii68-ii74. doi: 10.1093/bioinformatics/btac470.

DOI: 10.1093/bioinformatics/btac470
PMID: 36124802
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9486597/

Abstract

MOTIVATION

Computational methods for compound-protein affinity and contact (CPAC) prediction aim at facilitating rational drug discovery by simultaneous prediction of the strength and the pattern of compound-protein interactions. Although the desired outputs are highly structure-dependent, the lack of protein structures often makes structure-free methods rely on protein sequence inputs alone. The scarcity of compound-protein pairs with affinity and contact labels further limits the accuracy and the generalizability of CPAC models.

RESULTS

To overcome the aforementioned challenges of structure naivety and labeled-data scarcity, we introduce cross-modality and self-supervised learning, respectively, for structure-aware and task-relevant protein embedding. Specifically, protein data are available in both modalities of 1D amino-acid sequences and predicted 2D contact maps, which are separately embedded with recurrent and graph neural networks, respectively, as well as jointly embedded with two cross-modality schemes. Furthermore, both protein modalities are pre-trained under various self-supervised learning strategies by leveraging a massive amount of unlabeled protein data. Our results indicate that individual protein modalities differ in their strengths of predicting affinities or contacts. Proper cross-modality protein embedding combined with self-supervised learning improves model generalizability when predicting both affinities and contacts for unseen proteins.
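
The two protein modalities described above can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' implementation: the weights are random stand-ins for learned parameters, the recurrent encoder is a plain Elman-style pass, the graph encoder is a single round of mean aggregation over the contact map, and fusing by per-residue concatenation is an assumed choice, since the abstract does not specify the two cross-modality schemes. All function names are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def make_matrix(rows, cols, rng):
    """Random weight matrix; a stand-in for learned parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def residue_features(seq, dim, rng):
    """Initial per-residue vectors, one random vector per amino-acid type."""
    table = {aa: [rng.uniform(-1, 1) for _ in range(dim)] for aa in AMINO_ACIDS}
    return [table[aa] for aa in seq]

def rnn_embed(features, rng):
    """Sequence modality: Elman-style recurrence h_t = tanh(W x_t + U h_{t-1})."""
    dim = len(features[0])
    w_in, w_rec = make_matrix(dim, dim, rng), make_matrix(dim, dim, rng)
    hidden, states = [0.0] * dim, []
    for x in features:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states

def gnn_embed(features, contacts):
    """Graph modality: average each residue with the residues it contacts."""
    out = []
    for i, row in enumerate(contacts):
        nbrs = [j for j, c in enumerate(row) if c or j == i]  # add a self-loop
        dim = len(features[i])
        out.append([sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim)])
    return out

def fuse_by_concatenation(seq_states, graph_states):
    """Assumed fusion scheme: concatenate the two embeddings per residue."""
    return [s + g for s, g in zip(seq_states, graph_states)]

rng = random.Random(0)
seq = "MKTAY"                      # toy 5-residue sequence
contacts = [[0, 1, 0, 0, 1],       # toy symmetric 2D contact map
            [1, 0, 1, 0, 0],
            [0, 1, 0, 1, 0],
            [0, 0, 1, 0, 1],
            [1, 0, 0, 1, 0]]
feats = residue_features(seq, 4, rng)
fused = fuse_by_concatenation(rnn_embed(feats, rng), gnn_embed(feats, contacts))
print(len(fused), len(fused[0]))   # prints "5 8"
```

Each residue ends up with one vector that carries both sequential context (first half) and contact-map neighborhood context (second half). A downstream affinity or contact predictor would consume these per-residue vectors together with a compound embedding.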

AVAILABILITY AND IMPLEMENTATION

Data and source code are available at https://github.com/Shen-Lab/CPAC.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
