An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language model.

Authors

Zhang Yufang, Li Jiayi, Lin Shenggeng, Zhao Jianwei, Xiong Yi, Wei Dong-Qing

Affiliations

School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, 200240, China.

Peng Cheng Laboratory, Shenzhen, 518055, Guangdong, China.

Publication information

J Cheminform. 2024 Jun 7;16(1):67. doi: 10.1186/s13321-024-00862-9.

DOI: 10.1186/s13321-024-00862-9
PMID: 38849874
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11162000/
Abstract

Identification of interactions between chemical compounds and proteins is crucial for various applications, including drug discovery, target identification, network pharmacology, and elucidation of protein functions. Deep neural network-based approaches are becoming increasingly popular for efficiently identifying compound-protein interactions with high-throughput capabilities, narrowing down the scope of candidates for traditional labor-intensive, time-consuming and expensive experimental techniques. In this study, we proposed an end-to-end approach termed SPVec-SGCN-CPI, which utilized a simplified graph convolutional network (SGCN) model with low-dimensional, continuous features generated by our previously developed model SPVec, together with graph topology information, to predict compound-protein interactions. The SGCN technique, which separates the local neighborhood aggregation step from the nonlinear layer-wise propagation step, effectively aggregates K-order neighbor information while avoiding neighbor explosion and expediting training. The performance of the SPVec-SGCN-CPI method was assessed across three datasets and compared against four machine learning- and deep learning-based methods, as well as six state-of-the-art methods. Experimental results revealed that SPVec-SGCN-CPI outperformed all these competing methods, particularly excelling in unbalanced data scenarios. By propagating node features and topological information to the feature space, SPVec-SGCN-CPI effectively incorporates interactions between compounds and proteins, enabling the fusion of heterogeneity. Furthermore, our method scored all unlabeled data in ChEMBL, and the top five ranked compound-protein interactions were confirmed through molecular docking and existing evidence. These findings suggest that our model can reliably uncover compound-protein interactions within unlabeled compound-protein pairs, carrying substantial implications for drug re-profiling and discovery. In summary, SPVec-SGCN demonstrates its efficacy in accurately predicting compound-protein interactions, showcasing potential to enhance target identification and streamline drug discovery processes.

Scientific contributions

The methodology presented in this work not only enables comparatively accurate prediction of compound-protein interactions but also, for the first time, simultaneously takes into consideration sample imbalance, which is very common in the real world, and computational efficiency, accelerating the target identification and drug discovery process.
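The abstract's description of SGCN (neighborhood aggregation separated from the nonlinearity, with K-order neighbor information gathered in one pass) can be illustrated with a minimal sketch. It assumes the generic simplified graph convolution formulation, in which features are multiplied K times by the normalized adjacency matrix S = D^(-1/2) (A + I) D^(-1/2) before any classifier is trained. This is not the authors' implementation: the toy graph, the one-hot features, and the function names `normalized_adjacency` and `propagate` are all hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return S = D^-1/2 (A + I) D^-1/2 for a symmetric 0/1 adjacency matrix."""
    a_tilde = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_tilde.sum(axis=1)                 # degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate(adj, features, k):
    """Apply S to the features k times; no weights and no nonlinearity."""
    s = normalized_adjacency(adj)
    out = features.astype(float)
    for _ in range(k):
        out = s @ out
    return out

# Toy path graph 0-1-2-3 with one-hot node features (illustrative only).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = np.eye(4)

# With k=2, each node's representation mixes information from neighbors
# up to two hops away and nothing from farther nodes.
smoothed = propagate(adj, features, k=2)
print(smoothed.round(3))
```

Because the propagation has no trainable parameters, it can be computed once before training, and a plain linear classifier can then be fitted on `smoothed`. Under this assumed formulation, that precomputation is what avoids repeated neighbor expansion during training.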

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/76cf4bbc9592/13321_2024_862_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/3df0dfbcf591/13321_2024_862_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/6be073eeba68/13321_2024_862_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/2133870f8c3d/13321_2024_862_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/4864679f5841/13321_2024_862_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/c1a906e4ab2d/13321_2024_862_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/b72745c0c765/13321_2024_862_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/f911685aeab3/13321_2024_862_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d899/11162000/984af6c7fcdd/13321_2024_862_Fig9_HTML.jpg
