Improving protein-protein interaction prediction using protein language model and protein network features.

Author information

College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China.

Publication information

Anal Biochem. 2024 Oct;693:115550. doi: 10.1016/j.ab.2024.115550. Epub 2024 Apr 26.

DOI: 10.1016/j.ab.2024.115550
PMID: 38679191
Abstract

Interactions between proteins are ubiquitous in a wide variety of biological processes. Accurately identifying protein-protein interactions (PPIs) is important for understanding the mechanisms of protein function and for facilitating drug discovery. Although wet-lab methods are the best way to identify PPIs, they are time-consuming, costly, and labor-intensive. Hence, considerable effort has gone into developing computational methods that improve PPI prediction. In this study, we propose a novel hybrid computational method, KSGPPI, that aims to improve PPI prediction by extracting discriminative information from protein sequences and interaction networks. The KSGPPI model comprises two feature extraction modules. In the first module, a large protein language model, ESM-2, is employed to exploit the global complex patterns concealed within protein sequences. Feature representations are then further extracted through CKSAAP (composition of k-spaced amino acid pairs), and a two-dimensional convolutional neural network (CNN) is used to capture local information. In the second module, a protein similar to the query protein is retrieved from the STRING database using the sequence alignment tool NW-align, and the Node2vec algorithm is then used to capture a graph embedding feature for the query protein from the protein interaction network of the similar protein. Finally, the features from the two modules are fused, and the fused features are fed into a multilayer perceptron to predict PPIs. Five-fold cross-validation on the benchmark datasets shows that KSGPPI achieves an average prediction accuracy of 88.96%. Additionally, the average Matthews correlation coefficient of KSGPPI (0.781) is significantly higher than those of state-of-the-art PPI prediction methods.
The standalone package of KSGPPI is freely available for download at https://github.com/rickleezhe/KSGPPI.
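
To make the CKSAAP step in the abstract more concrete, the following is a minimal Python sketch of the commonly used definition of the encoding: for each gap k, count every ordered pair of amino acids separated by k residues and normalize by the number of such pairs in the sequence. It is an illustration only, not the authors' implementation. The function name `cksaap`, the default `k_max = 3`, and the handling of non-standard residues are assumptions, since the abstract does not state them.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered pairs, in a fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(sequence: str, k_max: int = 3) -> list[float]:
    """Return pair frequencies for gaps k = 0..k_max (400 values per gap).

    k_max is a hypothetical default; the abstract does not give the range used.
    """
    seq = sequence.upper()
    features: list[float] = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        # Number of residue pairs whose members are separated by k positions.
        windows = len(seq) - k - 1
        for i in range(max(windows, 0)):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # assumption: skip pairs with non-standard residues
                counts[pair] += 1
        total = max(windows, 1)  # avoid division by zero for very short sequences
        features.extend(counts[p] / total for p in PAIRS)
    return features


if __name__ == "__main__":
    vec = cksaap("MKTAYIAKQR", k_max=2)
    print(len(vec))  # 400 features for each of the three gap values
```

In a pipeline like the one described, a vector of this kind could be reshaped into a (k_max + 1) x 400 matrix before being passed to a two-dimensional CNN. How KSGPPI actually combines CKSAAP with the ESM-2 representations is not specified in the abstract.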

Similar articles

1. Improving protein-protein interaction prediction using protein language model and protein network features.
   Anal Biochem. 2024 Oct;693:115550. doi: 10.1016/j.ab.2024.115550. Epub 2024 Apr 26.
2. DSSGNN-PPI: A Protein-Protein Interactions prediction model based on Double Structure and Sequence graph neural networks.
   Comput Biol Med. 2024 Jul;177:108669. doi: 10.1016/j.compbiomed.2024.108669. Epub 2024 May 29.
3. DL-PPI: a method on prediction of sequenced protein-protein interaction based on deep learning.
   BMC Bioinformatics. 2023 Dec 14;24(1):473. doi: 10.1186/s12859-023-05594-5.
4. HN-PPISP: a hybrid network based on MLP-Mixer for protein-protein interaction site prediction.
   Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac480.
5. GNNGL-PPI: multi-category prediction of protein-protein interactions using graph neural networks based on global graphs and local subgraphs.
   BMC Genomics. 2024 May 9;25(1):406. doi: 10.1186/s12864-024-10299-x.
6. DualNetGO: a dual network model for protein function prediction via effective feature selection.
   Bioinformatics. 2024 Jul 1;40(7). doi: 10.1093/bioinformatics/btae437.
7. Improving protein-protein interaction site prediction using deep residual neural network.
   Anal Biochem. 2023 Jun 1;670:115132. doi: 10.1016/j.ab.2023.115132. Epub 2023 Mar 28.
8. Graph embedding-based novel protein interaction prediction via higher-order graph convolutional network.
   PLoS One. 2020 Sep 24;15(9):e0238915. doi: 10.1371/journal.pone.0238915. eCollection 2020.
9. Protein features fusion using attributed network embedding for predicting protein-protein interaction.
   BMC Genomics. 2024 May 13;25(1):466. doi: 10.1186/s12864-024-10361-8.
10. LePrimAlign: local entropy-based alignment of PPI networks to predict conserved modules.
    BMC Genomics. 2019 Dec 24;20(Suppl 9):964. doi: 10.1186/s12864-019-6271-3.

Cited by

1. A Survey of Pretrained Protein Language Models.
   Methods Mol Biol. 2025;2941:1-29. doi: 10.1007/978-1-0716-4623-6_1.
2. Recent advances in deep learning for protein-protein interaction: a review.
   BioData Min. 2025 Jun 16;18(1):43. doi: 10.1186/s13040-025-00457-6.
3. Synthesis and Evaluation of Isosteviol Derivatives: Promising Anticancer Therapies for Colon Cancer.
   Biomedicines. 2025 Mar 25;13(4):793. doi: 10.3390/biomedicines13040793.
4. scPRINT: pre-training on 50 million cells allows robust gene network predictions.
   Nat Commun. 2025 Apr 16;16(1):3607. doi: 10.1038/s41467-025-58699-1.
5. pNPs-CapsNet: Predicting Neuropeptides Using Protein Language Models and FastText Encoding-Based Weighted Multi-View Feature Integration with Deep Capsule Neural Network.
   ACS Omega. 2025 Mar 18;10(12):12403-12416. doi: 10.1021/acsomega.4c11449. eCollection 2025 Apr 1.
6. Improving Identification of Drug-Target Binding Sites Based on Structures of Targets Using Residual Graph Transformer Network.
   Biomolecules. 2025 Feb 3;15(2):221. doi: 10.3390/biom15020221.
7. PAPreC: A Pipeline for Antigenicity Prediction Comparison Methods across Bacteria.
   ACS Omega. 2025 Feb 3;10(6):5415-5429. doi: 10.1021/acsomega.4c07147. eCollection 2025 Feb 18.
8. TargetCLP: clathrin proteins prediction combining transformed and evolutionary scale modeling-based multi-view features via weighted feature integration approach.
   Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbaf026.