Hayden Celine A, Bosco Giovanni
Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ 85721, USA.
BMC Genomics. 2008 Feb 1;9:61. doi: 10.1186/1471-2164-9-61.
Upstream open reading frames (uORFs) are elements found in the 5'-region of an mRNA transcript. They can regulate protein production from the largest, or major, ORF (mORF) and affect organismal development and growth in fungi, plants, and animals. In Drosophila, approximately 40% of transcripts contain upstream start codons (uAUGs), but there is little evidence that these are translated and affect their associated mORF.
Analyzing 19,389 Drosophila melanogaster transcript annotations and 666,153 dipteran EST sequences, we identified 44 putative conserved peptide uORFs (CPuORFs) in Drosophila melanogaster that show evidence of negative selection and are therefore likely to be translated. Transcripts with CPuORFs constitute approximately 0.3% of all transcripts, a frequency similar to that in the Arabidopsis genome. The CPuORFs have a mean length of 70 amino acids, much larger than the mean length of plant CPuORFs (40 amino acids). There is a statistically significant clustering of CPuORFs at cytological band 57 (p = 10^-5), a phenomenon that has not previously been described for uORFs. Based on GO term and Interpro domain analyses, genes in the uORF dataset are enriched, relative to the genome-wide frequency, for ORFs implicated in mitochondrial import (p < 0.01) and for methyltransferases (p < 0.02).
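As an illustration of the first step in such an analysis, the sketch below scans a transcript for uAUGs and reports the uORF each one opens. It is a minimal example, not the authors' pipeline: the function name `find_uorfs`, the parameters `morf_start` and `min_codons`, and the assumption of the standard genetic code's three stop codons are all hypothetical choices made here. It does not attempt the conservation or negative-selection tests that distinguish a CPuORF from an arbitrary uORF.

```python
# Stop codons of the standard genetic code (assumed here for illustration).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, morf_start, min_codons=1):
    """Return (start, end) index pairs for uORFs in a transcript.

    transcript : mRNA or cDNA sequence (U is treated as T).
    morf_start : 0-based index of the first base of the mORF start codon.
    min_codons : minimum number of sense codons (start codon included,
                 stop codon excluded) required to report a uORF.

    `end` is exclusive and includes the stop codon. A uAUG with no
    in-frame stop codon before the end of the transcript is skipped.
    """
    seq = transcript.upper().replace("U", "T")
    uorfs = []
    # Only consider start codons that lie entirely upstream of the mORF start.
    for start in range(0, morf_start - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this uAUG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    # Toy transcript: a two-codon uORF (ATG AAA TAA) upstream of the mORF.
    toy = "GG" + "ATGAAATAA" + "CC" + "ATGGGGTGA"
    print(find_uorfs(toy, morf_start=13))
```

Candidates returned by a scan like this would still need to be compared across species, for example against EST sequences, before any claim of conservation could be made.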
Based on these data, Drosophila contains putative CPuORFs at frequencies similar to those found in plants. They are distinguished, however, by the type of mORF with which they tend to associate: Drosophila CPuORFs occur preferentially in transcripts encoding mitochondrial proteins and methyltransferases. This provides a basis for the study of CPuORFs and their putative regulatory role in mitochondrial function and disease.