MIT Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
Genome Res. 2011 Dec;21(12):2096-113. doi: 10.1101/gr.119974.110. Epub 2011 Oct 12.
While translational stop codon readthrough is often used by viral genomes, it has been observed for only a handful of eukaryotic genes. We previously used comparative genomics evidence to recognize protein-coding regions in 12 species of Drosophila and showed that for 149 genes, the open reading frame following the stop codon has a protein-coding conservation signature, hinting that stop codon readthrough might be common in Drosophila. We return to this observation armed with deep RNA sequence data from the modENCODE project, an improved higher-resolution comparative genomics metric for detecting protein-coding regions, comparative sequence information from additional species, and directed experimental evidence. We report an expanded set of 283 readthrough candidates, including 16 double-readthrough candidates; these were manually curated to rule out alternatives such as A-to-I editing, alternative splicing, dicistronic translation, and selenocysteine incorporation. We report experimental evidence of translation using GFP tagging and mass spectrometry for several readthrough regions. We find that the set of readthrough candidates differs from other genes in length, composition, conservation, stop codon context, and in some cases, conserved stem-loops, providing clues about readthrough regulation and potential mechanisms. Lastly, we expand our studies beyond Drosophila and find evidence of abundant readthrough in several other insect species and one crustacean, and several readthrough candidates in nematode and human, suggesting that functionally important translational stop codon readthrough is significantly more prevalent in Metazoa than previously recognized.
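As a rough illustration of what "the open reading frame following the stop codon" refers to, the sketch below extracts the in-frame sequence between an annotated stop codon and the next in-frame stop. It is not the authors' method: their candidates were identified from comparative genomics conservation signatures and then manually curated, which this example does not attempt. The function name `readthrough_region` and its arguments are hypothetical, and the standard nuclear stop codons (TAA, TAG, TGA) are assumed.

```python
# Hypothetical helper for illustration only; not part of the published analysis.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code


def readthrough_region(transcript, stop_index):
    """Return (region, next_stop) for a candidate readthrough extension.

    transcript: transcript sequence written with DNA letters (A, C, G, T)
    stop_index: 0-based position of the first base of the annotated stop codon
    region:     in-frame bases after the annotated stop, up to the next in-frame stop
    next_stop:  the downstream stop codon, or None if the transcript ends first
    """
    seq = transcript.upper()
    if seq[stop_index:stop_index + 3] not in STOP_CODONS:
        raise ValueError("no stop codon at stop_index")

    start = stop_index + 3
    # Walk codon by codon in the frame set by the annotated stop.
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return seq[start:i], codon

    # No downstream in-frame stop: keep only whole codons.
    end = start + (len(seq) - start) // 3 * 3
    return seq[start:end], None


if __name__ == "__main__":
    # Toy transcript: ATG GCC TAA GGA TCC TGA CCC
    toy = "ATGGCCTAAGGATCCTGACCC"
    print(readthrough_region(toy, 6))
```

Calling the function again with the position of the second stop codon gives the region that a double-readthrough event would add. Whether any such region is actually translated is a separate question that the paper addresses with conservation evidence, GFP tagging, and mass spectrometry.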