Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University in Kraków, ul. Gronostajowa 7, 30-387, Kraków, Poland.
AGH University of Science and Technology, Faculty of Materials Science and Ceramics, al. Mickiewicza 30, 30-059, Kraków, Poland.
Sci Rep. 2019 Mar 5;9(1):3436. doi: 10.1038/s41598-019-39847-2.
Recent advances in metagenomics have provided a valuable alternative to culture-based approaches for sampling viral diversity. However, some newly identified viruses lack sequence similarity to any previously sequenced virus and cannot be easily assigned to their hosts. Here we present a bioinformatic approach to this problem. We developed classifiers that distinguish eukaryotic viruses from phages with almost 95% prediction accuracy. The classifiers are wrapped in Host Taxon Predictor (HTP), software written in Python that is freely available at https://github.com/wojciech-galan/viruses_classifier. HTP's performance was then demonstrated on a collection of newly identified viral genomes and genome fragments. In summary, HTP is a culture- and alignment-free approach for distinguishing phages from eukaryotic viruses. We have also shown that the method can be extended to go further up the evolutionary tree and predict whether a virus can infect narrower taxa.
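To illustrate what "alignment-free" can mean in practice, the sketch below computes k-mer frequencies directly from a nucleotide sequence, with no comparison against reference genomes. This is only an assumption about the general kind of feature such a classifier might consume: the abstract does not state which features HTP uses, and the function name `kmer_frequencies` and the default `k=2` are hypothetical choices made for this example, not part of the HTP code base.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=2):
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Hypothetical helper for illustration only; not taken from HTP.
    Windows containing characters other than A, C, G, T (for example N)
    are skipped, so the frequencies are relative to valid windows only.
    """
    alphabet = "ACGT"
    seq = sequence.upper()
    # Count every overlapping window of length k made of unambiguous bases.
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    # Emit all 4**k k-mers so every sequence yields a vector of equal length.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    features = kmer_frequencies("ACGTNACG", k=2)
    print(features["AC"], features["GT"])
```

Because the output always has 4**k entries regardless of genome length, vectors like this could be passed to any standard classifier, and they can be computed for genome fragments as well as complete genomes.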