Department of Biochemistry, Cellular and Molecular Biology, The University of Tennessee, Knoxville, Tennessee 37996, USA.
RNA. 2012 Mar;18(3):368-84. doi: 10.1261/rna.031179.111. Epub 2012 Jan 11.
The sequence elements that mediate post-transcriptional gene regulation often reside in the 5' and 3' untranslated regions (UTRs) of mRNAs. Using six different families of dicotyledonous plants, we developed a comparative transcriptomics pipeline for the identification and annotation of deeply conserved regulatory sequences in the 5' and 3' UTRs. Our approach was robust to confounding effects of poor UTR alignability and rampant paralogy in plants. In the 3' UTR, motifs resembling PUMILIO-binding sites form a prominent group of conserved motifs. Additionally, Expansins, one of the few plant mRNA families known to be localized to specific subcellular sites, possess a core conserved RCCCGC motif. In the 5' UTR, one major subset of motifs consists of purine-rich repeats. A distinct and substantial fraction possesses upstream AUG start codons. Half of the AUG-containing motifs reveal hidden protein-coding potential in the 5' UTR, while the other half point to a peptide-independent function related to translation. Among the former, we added four novel peptides to the small catalog of conserved-peptide uORFs. Among the latter, our case studies document patterns of uORF evolution that include gain and loss of uORFs, switches in uORF reading frame, and switches in uORF length and position. In summary, nearly three hundred post-transcriptional elements show evidence of purifying selection across the eudicot branch of flowering plants, indicating a regulatory function spanning at least 70 million years. Some of these sequences have experimental precedent, but many are novel and encourage further exploration.
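As an illustration of two element types named in the abstract, the following minimal Python sketch locates a degenerate motif such as RCCCGC (R is the IUPAC code for a purine, A or G) and lists upstream AUGs that are followed by an in-frame stop codon within a 5' UTR. It is not the authors' pipeline, which relies on cross-species conservation; the function names `find_motif` and `find_uorfs` and the example sequences are hypothetical, and uORFs that run past the end of the UTR into the main coding region are deliberately ignored.

```python
import re

# Subset of IUPAC nucleotide codes, written for the RNA alphabet.
IUPAC = {"R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def normalize(seq):
    """Upper-case the sequence and convert DNA (T) to RNA (U)."""
    return seq.upper().replace("T", "U")


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq = normalize(seq)
    body = "".join(IUPAC.get(base, base) for base in motif)
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    return [m.start() for m in pattern.finditer(seq)]


def find_uorfs(utr5):
    """Return (start, end) for each AUG with an in-frame stop inside the UTR.

    `end` is exclusive and includes the stop codon.
    """
    utr5 = normalize(utr5)
    uorfs = []
    for m in re.finditer("(?=AUG)", utr5):
        start = m.start()
        # Walk codon by codon from the codon after the AUG.
        for i in range(start + 3, len(utr5) - 2, 3):
            if utr5[i:i + 3] in STOP_CODONS:
                uorfs.append((start, i + 3))
                break
    return uorfs


if __name__ == "__main__":
    print(find_motif("GGACCCGCUU", "RCCCGC"))
    print(find_uorfs("CCAUGGCUUAACC"))
```

A sequence scan of this kind only nominates candidates; the abstract's claim of function rests on purifying selection across eudicot families, which a single-sequence search cannot assess.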