Rahmatbakhsh Matineh, Moutaoufik Mohamed Taha, Gagarinova Alla, Babu Mohan
Department of Biochemistry, University of Regina, Regina, SK S4S 0A2, Canada.
Department of Biochemistry, University of Saskatchewan, Saskatoon, SK S7N 5E5, Canada.
Bioinform Adv. 2022 May 23;2(1):vbac038. doi: 10.1093/bioadv/vbac038. eCollection 2022.
Despite arduous and time-consuming experimental efforts, protein-protein interactions (PPIs) between many pathogenic microbes and their human host are still unknown, limiting both our understanding of the intricate interactions that occur during infection and the identification of therapeutic targets. Since computational tools offer a promising alternative, we developed HPiP (Host-Pathogen Interaction Prediction), an R/Bioconductor package that combines a series of amino acid sequence property descriptors with an ensemble of machine learning classifiers to predict the yet unmapped interactions between pathogen and host proteins.
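To illustrate what a sequence property descriptor can look like, the following minimal sketch computes amino acid composition (the relative frequency of each of the 20 standard residues) and concatenates the pathogen and host vectors into one feature vector for a candidate pair. It is written in Python for illustration only and is not HPiP's R API; the function names `aa_composition` and `pair_features` are hypothetical, and concatenation is assumed here as one simple way to represent a protein pair.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# descriptor vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the relative frequency of each standard amino acid.

    Non-standard characters (e.g. 'X') are ignored. Hypothetical helper,
    not part of the HPiP package.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def pair_features(pathogen_seq, host_seq):
    """Concatenate the two composition vectors into one 40-value vector.

    Concatenation is an assumption made for this sketch; other ways of
    combining the two descriptors are possible.
    """
    return aa_composition(pathogen_seq) + aa_composition(host_seq)


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    features = pair_features("MKTAYIAKQR", "MSDNGPQNQR")
    print(len(features))
```

Vectors of this kind, computed for known interacting and non-interacting pairs, are the sort of input a classifier ensemble could be trained on.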
Using severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) or the novel SARS-CoV-2 coronavirus-human PPI training sets as a case study, we show that HPiP achieves good performance in predicting PPIs between SARS-CoV-2 and human proteins, which we confirmed experimentally in human monocyte THP-1 cells and with several quality control metrics. HPiP also performed strongly in accurately predicting previously reported PPIs when tested against the sequences of pathogenic bacteria and human proteins. Collectively, our fully documented HPiP software will hasten the exploration of PPIs for a systems-level understanding of many understudied pathogens and help uncover molecular targets for repurposing existing drugs.
HPiP is released as open-source code under the MIT license and is freely available on GitHub (https://github.com/BabuLab-UofR/HPiP) as well as on Bioconductor (http://bioconductor.org/packages/devel/bioc/html/HPiP.html).
Supplementary data are available at Bioinformatics Advances online.