Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong (SAR), China.
Department of Information Engineering, The Chinese University of Hong Kong, New Territories, Hong Kong (SAR), China.
Mol Biol Evol. 2024 Aug 2;41(8). doi: 10.1093/molbev/msae171.
Segmented RNA viruses are a complex group of RNA viruses with multisegment genomes. Reconstructing complete segmented viruses is crucial for advancing our understanding of viral diversity, evolution, and public health impact. Using metatranscriptomic data to identify known and novel segmented viruses has sped up the survey of segmented viruses in various ecosystems. However, the high genetic diversity and the difficulty in binning complete segmented genomes present significant challenges in segmented virus reconstruction. Current virus detection tools are primarily used to identify nonsegmented viral genomes. This study presents SegVir, a novel tool designed to identify segmented RNA viruses and reconstruct their complete genomes from complex metatranscriptomes. SegVir leverages both close and remote homology searches to accurately detect conserved and divergent viral segments. Additionally, we introduce a new method that evaluates genome completeness and conservation based on gene content. Our evaluations on simulated datasets demonstrate SegVir's superior sensitivity and precision compared to existing tools. Moreover, in experiments using real data, we identified some virus segments missing from the NCBI database, underscoring SegVir's potential to enhance viral metagenome analysis. The source code and supporting data of SegVir are available via https://github.com/HubertTang/SegVir.