Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method.

Authors

Yang Xiaodi, Yang Shiping, Li Qinmengge, Wuchty Stefan, Zhang Ziding

Affiliations

State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China.

State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China.

Publication information

Comput Struct Biotechnol J. 2019 Dec 26;18:153-161. doi: 10.1016/j.csbj.2019.12.005. eCollection 2020.

DOI:10.1016/j.csbj.2019.12.005
PMID:31969974
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6961065/
Abstract

The identification of human-virus protein-protein interactions (PPIs) is an essential and challenging research topic, potentially providing a mechanistic understanding of viral infection. Because the experimental determination of human-virus PPIs is time-consuming and labor-intensive, computational methods play an important role in providing testable hypotheses, complementing the large-scale determination of interactomes between species. In this work, we applied an unsupervised sequence embedding technique (doc2vec) to represent protein sequences as rich, low-dimensional feature vectors. By training a Random Forest (RF) classifier on a dataset that covers known PPIs between human and all viruses, we obtained excellent predictive accuracy, outperforming various combinations of machine learning algorithms and commonly used sequence encoding schemes. In a rigorous comparison with three existing human-virus PPI prediction methods, our proposed computational framework also performed very competitively, suggesting that the doc2vec encoding scheme effectively captures context information in protein sequences that is relevant to the corresponding protein-protein interactions. Our approach is freely accessible through our web server as part of our host-pathogen PPI prediction platform (http://zzdlab.com/InterSPPI/). Taken together, we hope the current work not only contributes a useful predictor to accelerate the exploration of human-virus PPIs, but also provides some meaningful insights into human-virus relationships.
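
The pipeline described in the abstract (doc2vec embeddings of protein sequences fed to a Random Forest classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the overlapping k-mer tokenization, the concatenation of the human and viral embeddings, all hyperparameters, and the function names `kmer_tokens`, `pair_features`, and `train_pipeline` are hypothetical choices that the abstract does not specify. The training function assumes gensim (4.x) and scikit-learn are installed.

```python
from typing import Dict, List, Sequence, Tuple


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers ("words").

    Assumption: doc2vec needs a sequence of tokens, and overlapping
    k-mers are one common way to tokenize amino-acid strings.
    """
    sequence = sequence.strip().upper()
    if len(sequence) < k:
        # Sequences shorter than k become a single token (or nothing).
        return [sequence] if sequence else []
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def pair_features(human_vec: Sequence[float],
                  virus_vec: Sequence[float]) -> List[float]:
    """Concatenate two protein embeddings into one feature vector.

    Assumption: concatenation is used here for simplicity; other ways
    of combining the two vectors are equally possible.
    """
    return list(human_vec) + list(virus_vec)


def train_pipeline(sequences: Dict[str, str],
                   labeled_pairs: List[Tuple[str, str, int]],
                   vector_size: int = 32,
                   k: int = 3):
    """Fit doc2vec on all sequences, then a Random Forest on labeled pairs.

    sequences:     protein ID -> amino-acid sequence (human and viral)
    labeled_pairs: (human ID, virus ID, label), label 1 = interacting
    """
    # Third-party imports are local so the helpers above run without them.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument
    from sklearn.ensemble import RandomForestClassifier

    # One "document" per protein, tagged with its ID.
    docs = [TaggedDocument(words=kmer_tokens(seq, k), tags=[pid])
            for pid, seq in sequences.items()]
    # Unsupervised embedding step (illustrative hyperparameters).
    d2v = Doc2Vec(documents=docs, vector_size=vector_size, window=5,
                  min_count=1, epochs=20, seed=0)

    # gensim 4.x exposes the trained document vectors as `dv`.
    X = [pair_features(d2v.dv[h], d2v.dv[v]) for h, v, _ in labeled_pairs]
    y = [label for _, _, label in labeled_pairs]

    # Supervised step: Random Forest on the concatenated embeddings.
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    return d2v, clf
```

To score a new candidate pair with the returned models, one would look up (or infer with `d2v.infer_vector(kmer_tokens(seq))`) the two embeddings, combine them with `pair_features`, and pass the result to `clf.predict_proba`.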



Similar articles

1
Prediction of human-virus protein-protein interactions through a sequence embedding-based machine learning method.
Comput Struct Biotechnol J. 2019 Dec 26;18:153-161. doi: 10.1016/j.csbj.2019.12.005. eCollection 2020.
2
Cross-attention PHV: Prediction of human and virus protein-protein interactions using cross-attention-based neural networks.
Comput Struct Biotechnol J. 2022;20:5564-5573. doi: 10.1016/j.csbj.2022.10.012. Epub 2022 Oct 8.
3
Proteome-wide prediction and analysis of the protein-protein interaction network through integrative methods.
Comput Struct Biotechnol J. 2022 May 13;20:2322-2331. doi: 10.1016/j.csbj.2022.05.017. eCollection 2022.
4
Critical assessment and performance improvement of plant-pathogen protein-protein interaction prediction methods.
Brief Bioinform. 2019 Jan 18;20(1):274-287. doi: 10.1093/bib/bbx123.
5
Machine-Learning-Based Predictor of Human-Bacteria Protein-Protein Interactions by Incorporating Comprehensive Host-Network Properties.
J Proteome Res. 2019 May 3;18(5):2195-2205. doi: 10.1021/acs.jproteome.9b00074. Epub 2019 Apr 22.
6
POOE: predicting oomycete effectors based on a pre-trained large protein language model.
mSystems. 2024 Jan 23;9(1):e0100423. doi: 10.1128/msystems.01004-23. Epub 2023 Dec 11.
7
Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation.
Comput Math Methods Med. 2022 Feb 22;2022:7191684. doi: 10.1155/2022/7191684. eCollection 2022.
8
LGCA-VHPPI: A local-global residue context aware viral-host protein-protein interaction predictor.
PLoS One. 2022 Jul 5;17(7):e0270275. doi: 10.1371/journal.pone.0270275. eCollection 2022.
9
Predicting protein-protein interactions between human and hepatitis C virus via an ensemble learning method.
Mol Biosyst. 2014 Dec;10(12):3147-54. doi: 10.1039/c4mb00410h. Epub 2014 Sep 18.
10
Predicting protein-protein interactions from primary protein sequences using a novel multi-scale local feature representation scheme and the random forest.
PLoS One. 2015 May 6;10(5):e0125811. doi: 10.1371/journal.pone.0125811. eCollection 2015.

Cited by

1
Graph neural network integrated with pretrained protein language model for predicting human-virus protein-protein interactions.
Brief Bioinform. 2025 Aug 31;26(5). doi: 10.1093/bib/bbaf461.
2
Computational Analysis of Virus-Host Interactomes.
Methods Mol Biol. 2025;2940:79-91. doi: 10.1007/978-1-0716-4615-1_8.
3
Identification of RC3H1 as antiviral host factor binding to the non-structural protein 1 of Influenza A virus via a 3-stage computational pipeline and cell-based analysis.
Virol J. 2025 Apr 26;22(1):119. doi: 10.1186/s12985-025-02746-2.
4
VHI-Pred: A Multi-Feature-Based Tool for Predicting Human-Virus Protein-Protein Interactions.
Mol Biotechnol. 2025 Apr 5. doi: 10.1007/s12033-025-01417-5.
5
Prediction of influenza A virus-human protein-protein interactions using XGBoost with continuous and discontinuous amino acids information.
PeerJ. 2025 Jan 30;13:e18863. doi: 10.7717/peerj.18863. eCollection 2025.
6
Nanobody screening and machine learning guided identification of cross-variant anti-SARS-CoV-2 neutralizing heavy-chain only antibodies.
PLoS Pathog. 2025 Jan 23;21(1):e1012903. doi: 10.1371/journal.ppat.1012903. eCollection 2025 Jan.
7
HBFormer: a single-stream framework based on hybrid attention mechanism for identification of human-virus protein-protein interactions.
Bioinformatics. 2024 Nov 28;40(12). doi: 10.1093/bioinformatics/btae724.
8
Prediction of viral oncoproteins through the combination of generative adversarial networks and machine learning techniques.
Sci Rep. 2024 Nov 7;14(1):27108. doi: 10.1038/s41598-024-77028-y.
9
Bioinformatic Resources for Exploring Human-virus Protein-protein Interactions Based on Binding Modes.
Genomics Proteomics Bioinformatics. 2024 Dec 3;22(5). doi: 10.1093/gpbjnl/qzae075.
10
Prediction of Protein-Protein Interactions Based on Integrating Deep Learning and Feature Fusion.
Int J Mol Sci. 2024 May 27;25(11):5820. doi: 10.3390/ijms25115820.

References

1
Modeling aspects of the language of life through transfer-learning protein sequences.
BMC Bioinformatics. 2019 Dec 17;20(1):723. doi: 10.1186/s12859-019-3220-8.
2
A Structure-Informed Atlas of Human-Virus Interactions.
Cell. 2019 Sep 5;178(6):1526-1541.e16. doi: 10.1016/j.cell.2019.08.005. Epub 2019 Aug 29.
3
Understanding Human-Virus Protein-Protein Interactions Using a Human Protein Complex-Based Analysis Framework.
mSystems. 2019 Apr 9;4(2). doi: 10.1128/mSystems.00303-18. eCollection 2019 Mar-Apr.
4
Machine-Learning-Based Predictor of Human-Bacteria Protein-Protein Interactions by Incorporating Comprehensive Host-Network Properties.
J Proteome Res. 2019 May 3;18(5):2195-2205. doi: 10.1021/acs.jproteome.9b00074. Epub 2019 Apr 22.
5
Predicting protein-protein interactions through sequence-based deep learning.
Bioinformatics. 2018 Sep 1;34(17):i802-i810. doi: 10.1093/bioinformatics/bty573.
6
Prediction of human-Bacillus anthracis protein-protein interactions using multi-layer neural network.
Bioinformatics. 2018 Dec 15;34(24):4159-4164. doi: 10.1093/bioinformatics/bty504.
7
Predicting Interactions between Virus and Host Proteins Using Repeat Patterns and Composition of Amino Acids.
J Healthc Eng. 2018 May 9;2018:1391265. doi: 10.1155/2018/1391265. eCollection 2018.
8
Learned protein embeddings for machine learning.
Bioinformatics. 2018 Aug 1;34(15):2642-2648. doi: 10.1093/bioinformatics/bty178.
9
Critical assessment and performance improvement of plant-pathogen protein-protein interaction prediction methods.
Brief Bioinform. 2019 Jan 18;20(1):274-287. doi: 10.1093/bib/bbx123.
10
In Search of Lost Small Peptides.
Annu Rev Cell Dev Biol. 2017 Oct 6;33:391-416. doi: 10.1146/annurev-cellbio-100616-060516. Epub 2017 Jul 31.