Neafsey Daniel E, Galagan James E
Microbial Analysis Group, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
Mol Biol Evol. 2007 Aug;24(8):1744-51. doi: 10.1093/molbev/msm093. Epub 2007 May 9.
Upstream open reading frames (uORFs) are common features of eukaryotic genes, occurring in 10%-25% of 5' leader sequences. uORFs that have been analyzed experimentally have generally been found to decrease the translational efficiency of the downstream coding sequence. Previous investigations of uORFs in mammals and yeast have detected uORFs conserved over long evolutionary distances, prompting speculation about the nature and cause of the natural selection underlying such conservation. We have analyzed uORFs in the basidiomycetous fungal pathogen Cryptococcus neoformans to discern the properties of this purifying selection. We find that uORFs in the Cryptococcus species complex are conserved at twice the expected rate, and we report 122 uORFs that are conserved among all four sequenced Cryptococcus strains. A significantly greater proportion of uORF losses than expected occurs via direct mutation of the uORF start codon. This observation suggests that mutational disruption of a uORF that leaves the start codon intact may be selectively disadvantageous, perhaps because of the risk of premature translation initiation. Accounting for this constrained mode of loss and comparing the relative conservation of uORFs between the 5' leader and control sequences enables us to calculate that at least a third of uORFs may be conserved for their effects on translational efficiency. The remaining fraction may be conserved either by chance or as a result of selective pressure to prevent premature translation initiation from the uORF start codon. We find that the majority of conserved uORFs do not exhibit codon usage bias or conservation at the amino acid level, and therefore they are unlikely to encode bioactive peptides. Our analysis suggests that uORFs are an important and underappreciated mechanism of post-transcriptional gene regulation in eukaryotes.
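To make the distinction between the two modes of uORF loss concrete, the following is a minimal illustrative sketch in Python. It is not the authors' pipeline. It assumes the standard start codon ATG and the standard stop codons, treats a uORF as an ATG followed by an in-frame stop codon lying entirely within the 5' leader, and assumes the two leaders being compared are already aligned without gaps. The function names `find_uorfs` and `classify_uorf_fate` are hypothetical.

```python
# Illustrative sketch only; the names and simplifications here are assumptions,
# not the method used in the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader):
    """Return (start, end) pairs for each ATG with an in-frame stop inside the leader.

    `end` is the index just past the stop codon. ORFs that run off the end
    of the leader (for example, into the main coding sequence) are ignored.
    """
    leader = leader.upper()
    uorfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon, in frame with this ATG.
        for pos in range(start + 3, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


def classify_uorf_fate(derived_leader, start):
    """Classify the fate of a uORF that begins at `start` in a related leader.

    Assumes `derived_leader` is aligned to the reference leader without gaps,
    so the same index refers to the homologous position.
    """
    derived_leader = derived_leader.upper()
    if derived_leader[start:start + 3] != "ATG":
        return "start_codon_lost"
    if any(s == start for s, _ in find_uorfs(derived_leader)):
        return "retained"
    # The ATG is still present but no in-frame stop remains within the leader.
    return "start_intact_orf_disrupted"
```

In this toy model, a loss labelled `start_codon_lost` corresponds to the mode of loss the abstract reports as over-represented, while `start_intact_orf_disrupted` corresponds to a disruption that leaves the start codon in place. Real analyses would need to handle alignment gaps, uORFs that overlap the main coding sequence, and alternative start codons, none of which are covered here.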