Coutinho Felipe Hernandes, Zaragoza-Solas Asier, López-Pérez Mario, Barylski Jakub, Zielezinski Andrzej, Dutilh Bas E, Edwards Robert, Rodriguez-Valera Francisco
Evolutionary Genomics Group, Departamento de Producción Vegetal y Microbiología, Universidad Miguel Hernández, Aptdo. 18., Ctra. Alicante-Valencia N-332, s/n, San Juan de Alicante, 03550 Alicante, Spain.
Molecular Virology Research Unit, Faculty of Biology, Adam Mickiewicz University Poznan, 61-614 Poznan, Poland.
Patterns (N Y). 2021 Jun 15;2(7):100274. doi: 10.1016/j.patter.2021.100274. eCollection 2021 Jul 9.
Culture-independent approaches have recently shed light on the genomic diversity of viruses of prokaryotes. One fundamental question when trying to understand their ecological roles is: which hosts do they infect? To tackle this issue, we developed a machine-learning approach named Random Forest Assignment of Hosts (RaFAH), which uses scores against 43,644 protein clusters to assign hosts to complete or fragmented genomes of viruses of Archaea and Bacteria. RaFAH displayed performance comparable with that of other methods for virus-host prediction in three different benchmarks encompassing viruses from RefSeq, single amplified genomes, and metagenomes. RaFAH was applied to assembled metagenomic datasets of uncultured viruses from eight different biomes of medical, biotechnological, and environmental relevance. Our analyses identified 537 sequences of archaeal viruses representing unknown lineages, whose genomes encode novel auxiliary metabolic genes, shedding light on how these viruses interfere with the host molecular machinery. RaFAH is available at https://sourceforge.net/projects/rafah/.