Department of Biology, Utah State University, Logan, Utah, USA.
PLoS One. 2011 Feb 3;6(2):e14651. doi: 10.1371/journal.pone.0014651.
BACKGROUND: The study of large-scale genome structure has revealed patterns suggesting the influence of evolutionary constraints on genome evolution. However, the results of these studies can be difficult to interpret due to the conceptual complexity of the analyses. This makes it difficult to understand how observed statistical patterns relate to the physical distribution of genomic elements. We use a simpler and more intuitive approach to evaluate patterns of genome structure.
METHODOLOGY/PRINCIPAL FINDINGS: We used randomization tests based on Morisita's Index of aggregation to examine average differences in the distribution of purines and pyrimidines among coding and noncoding regions of 261 chromosomes from 223 microbial genomes representing 21 phylum-level groups. Purines and pyrimidines were aggregated in the noncoding DNA of 86% of genomes, but in the coding regions of only 52% of genomes. Coding and noncoding DNA differed in aggregation in 94% of genomes. Noncoding regions were more aggregated than coding regions in 91% of these genomes. Genome length appears to limit aggregation, but chromosome length does not. Chromosomes from the same species are similarly aggregated despite substantial differences in length. Aggregation differed among taxonomic groups, revealing support for a previously reported pattern relating genome structure to environmental conditions.
CONCLUSIONS/SIGNIFICANCE: Our approach revealed several patterns of genome structure among different types of DNA, different chromosomes of the same genome, and among different taxonomic groups. Similarity in aggregation among chromosomes of varying length from the same genome suggests that individual chromosome structure has not evolved independently of the general constraints on genome structure as a whole. These patterns were detected using simple and readily interpretable methods commonly used in other areas of biology.
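
To make the randomization test on Morisita's Index more concrete, the following is a minimal sketch in Python and not the authors' implementation. It uses the standard form of the index, I = n(Σx² − Σx) / ((Σx)² − Σx), where x is the count in each of n sampling units; values above 1 indicate aggregation. The abstract does not say how sampling units were defined, so the fixed-size, non-overlapping windows, the choice to count purines per window, the one-sided shuffle test, and all function and parameter names (`morisita_index`, `purine_counts`, `randomization_test`, `window`, `n_perm`) are assumptions made for illustration.

```python
import random

PURINES = frozenset("AG")


def morisita_index(counts):
    """Morisita's Index of aggregation for a list of per-unit counts."""
    n = len(counts)
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two counted items")
    sum_sq = sum(x * x for x in counts)
    return n * (sum_sq - total) / (total * total - total)


def purine_counts(seq, window):
    """Count purines in consecutive non-overlapping windows (assumed unit)."""
    seq = seq.upper()
    return [
        sum(base in PURINES for base in seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


def randomization_test(seq, window, n_perm=999, seed=0):
    """Compare the observed index with indices from shuffled sequences.

    Returns (observed index, one-sided p-value for aggregation).
    """
    rng = random.Random(seed)
    observed = morisita_index(purine_counts(seq, window))
    bases = list(seq)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(bases)  # destroys order, keeps base composition
        shuffled = morisita_index(purine_counts("".join(bases), window))
        if shuffled >= observed:
            at_least_as_extreme += 1
    # Add one to both terms so the observed arrangement counts as a sample.
    return observed, (at_least_as_extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    toy = "A" * 40 + "C" * 40  # purines clumped in the first half
    index, p_value = randomization_test(toy, window=10)
    print(f"Morisita index = {index:.3f}, p = {p_value:.3f}")
```

Shuffling preserves base composition while removing positional structure, so the comparison isolates how purines and pyrimidines are arranged rather than how many there are. A comparison between coding and noncoding regions would apply the same calculation to each class of sequence separately.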