Origin and length distribution of unidirectional prokaryotic overlapping genes.

Authors

Fonseca Miguel M, Harris D James, Posada David

Affiliation

Department of Biochemistry, Genetics and Immunology, University of Vigo, 36310 Vigo, Spain.

Publication

G3 (Bethesda). 2014 Jan 10;4(1):19-27. doi: 10.1534/g3.113.005652.

Abstract

Unidirectional overlapping genes in prokaryotes can originate when the start or stop codon of one protein-coding gene is disrupted and replaced by another start or stop codon located within the adjacent gene. However, the probability that a start or stop codon is disrupted and replaced may differ significantly depending on the number and redundancy of the codons in the start and stop codon sets. Here, we performed a simulation study of the formation of unidirectional overlapping genes using a simple model of nucleotide change and compared the results with empirical data. Our results suggest that overlaps that originate by elongation of the 3'-end of the upstream gene are significantly more frequent than those that originate by elongation of the 5'-end of the downstream gene. Accordingly, we propose a model for the creation of unidirectional overlaps that is based on the disruption probabilities of the start and stop codon sets and on the different probabilities of phase 1 and phase 2 overlaps. Additionally, our results suggest that phase 2 overlaps form at higher rates than phase 1 overlaps over the same evolutionary time. Finally, we propose that there is no need to invoke selection to explain the prevalence of long phase 1 unidirectional overlaps. Rather, the overrepresentation of long phase 1 relative to long phase 2 overlaps might occur because phase 2 overlaps are highly likely to be retained as short overlaps by chance. This pattern is stronger if selection against very long overlaps is included in the model. As a whole, our model explains to a large extent the empirical length distribution of unidirectional overlaps in prokaryotic genomes.
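To make the idea of codon-set redundancy concrete, the following is a minimal sketch and not the authors' simulation. It assumes that every single-nucleotide substitution is equally likely, that each codon in a set is used equally often, and that the start set is ATG, GTG and TTG (the abstract does not state which start codons were modeled). The stop set TAA, TAG, TGA is the standard one. The function names are hypothetical. The sketch counts the fraction of single substitutions that turn a codon of a set into a codon outside that set, which is one simple reading of "disruption probability".

```python
from fractions import Fraction

BASES = "ACGT"

# Standard stop codons.
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})
# ASSUMPTION: a commonly cited prokaryotic start set; the abstract
# does not say which start codons the authors actually modeled.
START_CODONS = frozenset({"ATG", "GTG", "TTG"})


def single_substitutions(codon):
    """Yield the 9 codons that differ from `codon` at exactly one position."""
    for pos, old in enumerate(codon):
        for new in BASES:
            if new != old:
                yield codon[:pos] + new + codon[pos + 1:]


def disruption_fraction(codon_set):
    """Fraction of single substitutions that leave the codon set.

    ASSUMPTION: uniform substitution rates and equal codon usage,
    which is a toy model rather than the model used in the paper.
    """
    total = 0
    disrupted = 0
    for codon in codon_set:
        for mutant in single_substitutions(codon):
            total += 1
            if mutant not in codon_set:
                disrupted += 1
    return Fraction(disrupted, total)


if __name__ == "__main__":
    print("stop set :", disruption_fraction(STOP_CODONS))
    print("start set:", disruption_fraction(START_CODONS))
```

Under these assumptions, 23 of 27 substitutions disrupt a stop codon and 21 of 27 disrupt a start codon, so the two sets differ in how easily a point mutation removes them. A different start set or an unequal substitution model would change these numbers.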


