Kingsford Carl, Delcher Arthur L, Salzberg Steven L
Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, USA.
Mol Biol Evol. 2007 Sep;24(9):2091-8. doi: 10.1093/molbev/msm145. Epub 2007 Jul 21.
Overlapping genes are a common phenomenon. Among sequenced prokaryotes, more than 29% of all annotated genes overlap at least 1 of their 2 flanking genes. We present a unified model for the creation and repair of overlaps among adjacent genes whose 3' ends either overlap or nearly overlap. Our model, derived from a comprehensive analysis of complete prokaryotic genomes in GenBank, explains the nonuniform distribution of the lengths of such overlap regions far more simply than previously proposed models. Specifically, we explain the distribution of overlap lengths based on random extensions of genes to the next downstream stop codon. Our model also explains a pattern, first reported here, in the distribution of separation distances between closely spaced nonoverlapping genes. We provide evidence that this biased distribution of separation distances is driven by the same phenomenon that creates the uneven distribution of overlap lengths. This suggests a dynamic picture of continual overlap creation and elimination.
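The central mechanism, extension of a gene to the next downstream stop codon, can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the authors' analysis: the function names `extension_length` and `simulate_extensions`, the uniformly random downstream sequence, and the 300-nucleotide region are all hypothetical choices made for illustration, and only the three standard stop codons (TAA, TAG, TGA) are considered.

```python
import random
from typing import List, Optional

# Standard stop codons on the coding strand (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extension_length(downstream: str) -> Optional[int]:
    """Length in nucleotides of the read-through extension.

    `downstream` is the sequence immediately after a lost stop codon,
    in the gene's reading frame. The returned length includes the new
    stop codon; None means no in-frame stop was found.
    """
    for i in range(0, len(downstream) - 2, 3):
        if downstream[i:i + 3] in STOP_CODONS:
            return i + 3
    return None


def simulate_extensions(n_trials: int, region_len: int = 300,
                        seed: int = 0) -> List[int]:
    """Extension lengths over random downstream regions.

    Assumption: bases are drawn uniformly and independently, which real
    genomes do not satisfy; trials with no in-frame stop are skipped.
    """
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_trials):
        seq = "".join(rng.choice("ACGT") for _ in range(region_len))
        length = extension_length(seq)
        if length is not None:
            lengths.append(length)
    return lengths


if __name__ == "__main__":
    sample = simulate_extensions(1000)
    print(len(sample), "extensions found; shortest:", min(sample))
```

In this toy, the extension stops only at an in-frame stop codon, so out-of-frame occurrences such as the TGA inside `ATGAAA` are ignored. When the downstream region is the opposite strand of a neighboring gene rather than random sequence, the extension length determines how far the two 3' ends overlap, which is the quantity whose distribution the abstract discusses.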