Newcastle University Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, NE2 4HH, UK.
Present address: Institute of Biotechnology, Helsinki Institute of Life Sciences (HiLIFE), University of Helsinki, Viikki Biocenter 2, Helsinki 00014, Finland.
Microb Genom. 2021 Sep;7(9). doi: 10.1099/mgen.0.000649.
The nucleocytoplasmic large DNA viruses (NCLDVs) are a diverse group that currently contains the viruses with the largest known virions and genomes, also called giant viruses. The first giant virus was isolated and described nearly 20 years ago. Its genome was larger than that of any other virus known at the time, and it contained a number of genes that had not previously been described in any virus. The origin and evolution of these unusually complex viruses have been puzzling, and various mechanisms have been put forward to explain how some NCLDVs could have reached genome sizes and coding capacities overlapping with those of cellular microbes. Here we critically discuss the evidence and arguments on this topic. We have also updated and systematically reanalysed protein families of the NCLDVs to further study their origin and evolution. Our analyses further highlight the small number of widely shared genes and the extreme genomic plasticity among NCLDVs, whose genomes are shaped by combinations of gene duplications, deletions, lateral gene transfers and creation of protein-coding genes. The dramatic expansions of genome size and protein-coding gene capacity characteristic of some NCLDVs are now increasingly understood to be driven by environmental factors rather than to reflect descent from an ancient common ancestor within a hypothetical cellular lineage. Thus, the evolution of NCLDVs is writ large viral, and their origin, like that of all other viral lineages, remains unknown.