Iranzo Jaime, Krupovic Mart, Koonin Eugene V
National Center for Biotechnology Information, National Library of Medicine, Bethesda, Maryland, USA.
Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France.
mBio. 2016 Aug 2;7(4):e00978-16. doi: 10.1128/mBio.00978-16.
Virus genomes are prone to extensive gene loss, gain, and exchange and share no universal genes. Therefore, in broad-scale studies of virus evolution, gene and genome network analyses can complement traditional phylogenetics. We performed an exhaustive comparative analysis of the genomes of double-stranded DNA (dsDNA) viruses using the bipartite network approach and found robust hierarchical modularity in the dsDNA virosphere. A bipartite network consists of two classes of nodes, here genomes and genes, in which nodes of one class (genomes) are connected only via nodes of the other class (genes). Such a network can be partitioned into modules that combine nodes from both classes. The bipartite network of dsDNA viruses includes 19 modules that form 5 major and 3 minor supermodules. Of these modules, 11 include tailed bacteriophages, reflecting the diversity of this largest group of viruses. The module analysis quantitatively validates and refines previously proposed nontrivial evolutionary relationships. An expansive supermodule combines the large and giant viruses of the putative order "Megavirales" with diverse moderate-sized viruses and related mobile elements. All viruses in this supermodule share a distinct morphogenetic tool kit with a double jelly roll major capsid protein. Herpesviruses and tailed bacteriophages form another supermodule, held together by a distinct set of morphogenetic proteins centered on the HK97-like major capsid protein. Together, these two supermodules cover the great majority of currently known dsDNA viruses. We formally identify a set of 14 viral hallmark genes that constitute the hubs of the network and account for most of the intermodule connections.
Viruses and related mobile genetic elements are the dominant biological entities on Earth, but their evolution is not sufficiently understood and their classification is not adequately developed. The key reason is the characteristically high rate of virus evolution, which involves not only sequence change but also extensive gene loss, gain, and exchange. Therefore, in large-scale studies of virus evolution, traditional phylogenetic approaches have limited applicability and have to be complemented by gene and genome network analyses. We applied state-of-the-art methods of such analysis to reveal robust hierarchical modularity in the genomes of double-stranded DNA viruses. Some of the identified modules combine highly diverse viruses infecting bacteria, archaea, and eukaryotes, in support of previous hypotheses on direct evolutionary relationships between viruses from the three domains of cellular life. We formally identify a set of 14 viral hallmark genes that hold together the genomic network.
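The bipartite genome-gene representation described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the genome names, gene-family labels, and module assignment are hypothetical toy data, not results from the study. The score used, Barber's bipartite modularity, is one standard way to evaluate such a partition and may differ from the procedure the authors applied. The `connector_genes` helper is only a crude stand-in for the idea of genes that link modules.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): genome -> gene families it carries.
GENOMES = {
    "phage_A": {"HK97_MCP", "terminase", "portal"},
    "phage_B": {"HK97_MCP", "terminase", "portal", "DNAP_B"},
    "herpes_X": {"HK97_MCP", "terminase"},
    "giant_Y": {"DJR_MCP", "packaging_ATPase", "DNAP_B"},
    "small_Z": {"DJR_MCP", "packaging_ATPase"},
}

# Assumed partition for illustration: genomes and genes share module labels,
# because modules of a bipartite network combine nodes from both classes.
MODULES = {
    "phage_A": 0, "phage_B": 0, "herpes_X": 0,
    "HK97_MCP": 0, "terminase": 0, "portal": 0,
    "giant_Y": 1, "small_Z": 1,
    "DJR_MCP": 1, "packaging_ATPase": 1, "DNAP_B": 1,
}


def gene_degrees(genomes):
    """Count, for every gene, the number of genomes that carry it."""
    degrees = defaultdict(int)
    for genes in genomes.values():
        for gene in genes:
            degrees[gene] += 1
    return dict(degrees)


def barber_modularity(genomes, modules):
    """Barber's bipartite modularity of a given partition.

    Q = (1/m) * sum over genome i, gene j in the same module
        of (A_ij - k_i * d_j / m),
    where m is the number of genome-gene edges, k_i the number of genes
    in genome i, and d_j the number of genomes carrying gene j.
    """
    degrees = gene_degrees(genomes)
    m = sum(len(genes) for genes in genomes.values())
    q = 0.0
    for genome, genes in genomes.items():
        k = len(genes)
        for gene, d in degrees.items():
            if modules[genome] == modules[gene]:
                observed = 1.0 if gene in genes else 0.0
                q += observed - k * d / m
    return q / m


def connector_genes(genomes, modules):
    """Genes carried by genomes assigned to more than one module."""
    seen = defaultdict(set)
    for genome, genes in genomes.items():
        for gene in genes:
            seen[gene].add(modules[genome])
    return sorted(gene for gene, mods in seen.items() if len(mods) > 1)


if __name__ == "__main__":
    print("Q =", round(barber_modularity(GENOMES, MODULES), 4))
    print("connectors:", connector_genes(GENOMES, MODULES))
```

In this toy network only `DNAP_B` occurs in genomes of both modules, so it is the single connector. In a real analysis the partition would be found by optimizing a score of this kind rather than assigned by hand.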