Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University Poznan, Uniwersytetu Poznanskiego 6, 61-614, Poznan, Poland.
Molecular Virology Research Unit, Faculty of Biology, Adam Mickiewicz University Poznan, Uniwersytetu Poznanskiego 6, 61-614, Poznan, Poland.
BMC Biol. 2021 Oct 8;19(1):223. doi: 10.1186/s12915-021-01146-6.
Characterizing phage-host interactions is critical to understanding the ecological role of both partners and to the effective isolation of phage therapeutics. Unfortunately, experimental methods for studying these interactions are slow, low-throughput, and unsuitable for phages or hosts that are difficult to maintain in laboratory conditions. Therefore, a number of in silico methods have emerged to predict prokaryotic hosts from viral sequences. One of the leading approaches applies the BLAST tool to search for local similarities between viral and microbial genomes. However, this prediction method has three major limitations: (i) top-scoring sequences do not always point to the actual host; (ii) mosaic virus genomes may match many, typically related, bacteria; and (iii) viral and host sequences may diverge beyond the point where their relationship can be detected by a BLAST alignment.
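For illustration only, the standard BLAST approach described above can be sketched as a "best hit" rule: for each phage query, take the host genome with the highest bit score. The sketch below assumes BLAST's default 12-column tabular output (query ID in column 1, subject ID in column 2, bit score in column 12); the function name `best_hits` and the example identifiers are hypothetical, and the rows are made-up placeholders rather than real results.

```python
import csv
import io


def best_hits(blast_tsv):
    """Return {query_id: (subject_id, bitscore)} for the top-scoring hit per query.

    Assumes the default 12-column BLAST tabular layout, where column 1 is the
    query ID, column 2 the subject ID, and column 12 the bit score.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        # Keep only the highest bit score seen so far for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical, hand-made rows (not real data): two hits for one phage.
_ROWS = [
    ["phageA", "hostY", "91.0", "300", "27", "0", "1", "300", "10", "309", "1e-50", "180.0"],
    ["phageA", "hostX", "95.0", "400", "20", "0", "1", "400", "50", "449", "1e-80", "250.0"],
]
EXAMPLE = "\n".join("\t".join(r) for r in _ROWS)

if __name__ == "__main__":
    print(best_hits(EXAMPLE))
```

This rule is exposed to limitation (i): the single top-scoring genome is reported even when the true host sits lower in the hit list.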
We created an extension to BLAST, named Phirbo, that improves host prediction quality beyond what is obtainable from standard BLAST searches. The tool uses information on both sequence similarity and bacterial relatedness to predict phage-host interactions. Phirbo was evaluated on three benchmark sets of known virus-host pairs, and it improved precision and recall by 11-40 percentage points over currently available, state-of-the-art alignment-based, alignment-free, and machine-learning host prediction tools. Moreover, the discriminatory power of Phirbo for recognizing virus-host relationships surpassed that of other tools by at least 10 percentage points (area under the curve = 0.95). Phirbo yielded a mean host prediction accuracy of 57% and 68% at the genus and family levels, respectively; this accuracy dropped by 12 percentage points when only a fraction of the viral genome sequence (3 kb) was used. Finally, we provide insights into a repertoire of protein and ncRNA genes that are shared between phages and hosts and may be prone to horizontal transfer during infection.
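The abstract does not spell out how sequence similarity and bacterial relatedness are combined. One plausible reading, assumed here purely for illustration, is that a phage and a candidate host are each represented by a ranked list of their BLAST matches, and the two lists are compared with a top-weighted overlap measure. The sketch below uses a truncated form of rank-biased overlap; the function names, the parameter `p`, and the example taxa are hypothetical and should not be taken as the tool's actual implementation.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.75):
    """Truncated rank-biased overlap between two ranked lists (no extrapolation).

    At each depth d, the agreement is the fraction of items shared by the two
    top-d prefixes; agreements are weighted by p**(d-1), so the top ranks count
    most. Returns a value in [0, 1).
    """
    depth = max(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, depth + 1):
        if d <= len(ranking_a):
            seen_a.add(ranking_a[d - 1])
        if d <= len(ranking_b):
            seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        score += p ** (d - 1) * agreement
    return (1 - p) * score


def predict_host(phage_ranking, host_rankings, p=0.75):
    """Return the candidate host whose ranked list best overlaps the phage's list."""
    return max(
        host_rankings,
        key=lambda host: rbo_truncated(phage_ranking, host_rankings[host], p),
    )


# Hypothetical ranked lists of matched taxa (illustrative only, not real data).
PHAGE = ["Escherichia", "Shigella", "Salmonella"]
HOSTS = {
    "host_ecoli": ["Escherichia", "Shigella", "Klebsiella"],
    "host_bacillus": ["Bacillus", "Staphylococcus", "Listeria"],
}

if __name__ == "__main__":
    print(predict_host(PHAGE, HOSTS))
```

Comparing whole ranked lists rather than a single top hit means that a phage matching several related bacteria can still be assigned to a host with a similar match profile, which is the intuition behind addressing limitations (i) and (ii).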
Our results suggest that Phirbo is a simple and effective tool for predicting phage-host relationships.