Comparative genomic analysis of 60 Mycobacteriophage genomes: genome clustering, gene acquisition, and gene size.

Author information

Department of Biological Sciences, Pittsburgh Bacteriophage Institute, Pittsburgh, PA 15260, USA.

Publication information

J Mol Biol. 2010 Mar 19;397(1):119-43. doi: 10.1016/j.jmb.2010.01.011. Epub 2010 Jan 11.

Abstract

Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of 60, all infecting a common bacterial host, provides further insight into their diversity and evolution. Of the 60 phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, 5 of which can be further divided into subclusters; 5 genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the 6 genomes in Cluster D share more than 97.5% average nucleotide similarity with one another. In contrast, similarity between the 2 genomes in Cluster I is barely detectable by diagonal plot analysis. In total, 6858 predicted open reading frames have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries, and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit a smaller average size than genes of their host (205 residues compared with 315), phage genes in higher flux average only 100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains.
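
The figures quoted in the abstract can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not part of the original article; the constant and function names are assumptions introduced here, and the derived counts are approximate because the abstract reports rounded percentages.

```python
# Figures taken directly from the abstract.
N_GENOMES = 60            # sequenced mycobacteriophage genomes
N_CLUSTERED = 55          # genomes assigned to one of nine clusters
N_UNCLUSTERED = 5         # genomes that do not cluster with other phages
N_ORFS = 6858             # predicted open reading frames
N_PHAMS = 1523            # phamilies (phams) of related sequences
FRAC_SINGLE_MEMBER = 0.46     # phams with only a single member
FRAC_EXTERNAL_HITS = 0.188    # phams matching non-mycobacteriophage entries
PHAGE_GENE_RESIDUES = 205     # average phage gene size (residues)
HOST_GENE_RESIDUES = 315      # average host gene size (residues)


def derived_counts():
    """Return back-of-the-envelope quantities implied by the abstract.

    Hypothetical helper for illustration; the values are estimates,
    not numbers reported by the authors.
    """
    return {
        # Clustered and unclustered genomes should account for all genomes.
        "genomes_accounted_for": N_CLUSTERED + N_UNCLUSTERED,
        "orfs_per_genome": N_ORFS / N_GENOMES,
        "orfs_per_pham": N_ORFS / N_PHAMS,
        # Approximate counts recovered from rounded percentages.
        "single_member_phams": round(FRAC_SINGLE_MEMBER * N_PHAMS),
        "phams_with_external_hits": round(FRAC_EXTERNAL_HITS * N_PHAMS),
        "phage_to_host_gene_size_ratio": PHAGE_GENE_RESIDUES / HOST_GENE_RESIDUES,
    }


if __name__ == "__main__":
    for name, value in derived_counts().items():
        print(f"{name}: {value:.2f}" if isinstance(value, float) else f"{name}: {value}")
```

Under these assumptions, each genome carries roughly 114 predicted genes, about 700 phams have a single member, and the average phage gene is about two-thirds the length of the average host gene.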
