Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Japan.
Laboratoire de Physiologie Cellulaire et Végétale, CEA, University Grenoble Alpes, CNRS, INRA, IRIG, Grenoble, France.
mSphere. 2021 Apr 21;6(2):e01298-20. doi: 10.1128/mSphere.01298-20.
Nucleocytoplasmic large DNA viruses (NCLDVs) are highly diverse and abundant in marine environments. However, knowledge of their hosts is limited because only a few NCLDVs have been isolated so far. Host prediction approaches that take advantage of the recent large-scale marine metagenomics census are expected to fill this gap and expand our knowledge of virus-host relationships for unknown NCLDVs. In this study, we built co-occurrence networks of NCLDVs and eukaryotic taxa to predict virus-host interactions using Tara Oceans sequencing data. Using the positive likelihood ratio to assess the performance of host prediction for NCLDVs, we benchmarked several co-occurrence approaches and demonstrated a 4-fold increase in the odds ratio of predicting true-positive relationships compared with random host predictions. To further refine host predictions from high-dimensional co-occurrence networks, we developed a phylogeny-informed filtering method, Taxon Interaction Mapper, and showed that it improved prediction performance a further 12-fold. Finally, we inferred virophage-NCLDV networks to corroborate that co-occurrence approaches are effective for predicting interacting partners of NCLDVs in marine environments.

NCLDVs can infect a wide range of eukaryotes, although their life cycle is less dependent on hosts than that of other viruses. However, our understanding of NCLDV-host systems is highly limited because few of these viruses have been isolated so far. Co-occurrence information has been assumed to be useful for predicting virus-host interactions. In this study, we quantitatively show the effectiveness of co-occurrence inference for NCLDV host prediction. We also improve prediction performance with a phylogeny-guided method, which yields a concise list of candidate host lineages for three NCLDV families. Our results support the use of co-occurrence approaches for the metagenomic exploration of the ecology of this diverse group of viruses.
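The positive likelihood ratio used as the benchmark metric can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the toy virus and host labels, and the choice to treat every candidate pair without a known interaction as a negative are all assumptions made for illustration. The ratio is the true-positive rate divided by the false-positive rate, so a value of 1 corresponds to random prediction.

```python
def positive_likelihood_ratio(predicted, known_positive, candidates):
    """LR+ = sensitivity / (1 - specificity) over a set of candidate pairs.

    Assumption (hypothetical setup): any candidate pair that is not a known
    interaction is counted as a negative.
    """
    candidates = set(candidates)
    predicted = set(predicted) & candidates
    positives = set(known_positive) & candidates
    negatives = candidates - positives
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative pair")

    tp = len(predicted & positives)   # predicted edges that are known interactions
    fp = len(predicted & negatives)   # predicted edges with no known interaction
    if fp == 0:
        # No false positives: the ratio is unbounded (or undefined with no hits).
        return float("inf") if tp else float("nan")

    # (tp / P) / (fp / N), rearranged to a single division.
    return (tp * len(negatives)) / (fp * len(positives))


# Toy example with made-up labels: 2 viruses x 4 host taxa = 8 candidate pairs.
viruses = ["virus_A", "virus_B"]
hosts = ["host_1", "host_2", "host_3", "host_4"]
candidates = [(v, h) for v in viruses for h in hosts]

known = [("virus_A", "host_1"), ("virus_B", "host_2")]
network_edges = [("virus_A", "host_1"), ("virus_B", "host_3")]

lr_plus = positive_likelihood_ratio(network_edges, known, candidates)
print(lr_plus)  # sensitivity 1/2, false-positive rate 1/6
```

In this toy case the network recovers one of two known pairs and makes one false call among six negatives, so its predictions are three times more likely to be true interactions than random guesses would be.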