Doyle Frank, Leonardi Andrea, Endres Lauren, Tenenbaum Scott A, Dedon Peter C, Begley Thomas J
State University of New York - SUNY Polytechnic Institute, College of Nanoscale Science and Engineering, Albany, NY.
State University of New York - SUNY Polytechnic Institute, College of Arts and Sciences, Utica, NY.
Methods. 2016 Sep 1;107:98-109. doi: 10.1016/j.ymeth.2016.05.010. Epub 2016 May 28.
The translation of mRNA in all forms of life uses a three-nucleotide codon and aminoacyl-tRNAs to synthesize a protein. There are 64 possible codons in the genetic code, with codons for the ∼20 amino acids and 3 stop codons having 1- to 6-fold degeneracy. Recent studies have shown that families of stress response transcripts, termed modification tunable transcripts (MoTTs), use distinct codon biases that match specifically modified tRNAs to regulate their translation during stress. Similarly, translational reprogramming of the UGA stop codon, either to generate selenoproteins or to perform programmed translational read-through (PTR) that results in a longer protein, requires distinct codon bias (i.e., more than one stop codon) and, in the case of selenoproteins, a specifically modified tRNA. To identify transcripts whose codon usage patterns could be subject to translational control mechanisms, we have used existing genome and transcript data to develop the gene-specific Codon UTilization (CUT) tool and database, which details all 1-, 2-, 3-, 4- and 5-codon combinations for all genes or transcripts in yeast (Saccharomyces cerevisiae), mice (Mus musculus) and rats (Rattus norvegicus). Here, we describe the use of the CUT tool and database to characterize significant codon usage patterns in specific genes and groups of genes. In yeast, we demonstrate how the CUT database can be used to identify genes that have runs of specific codons (e.g., AGA, GAA, AAG) linked to translational regulation by tRNA methyltransferase 9 (Trm9). We further demonstrate how groups of genes can be analyzed to find significant dicodon patterns, with the AGA-GAA dicodon significantly (P<0.00001) over-represented in the 80 Gcn4-regulated transcripts. We have also used the CUT database to identify mouse and rat transcripts with internal UGA codons, with the surprising finding of 45 and 120 such transcripts, respectively, far more than expected. The UGA data suggest that there could be many more translationally reprogrammed transcripts than currently reported. CUT thus represents a multi-species codon-counting database that can be used with mRNA-, translation- and proteomics-based results to better understand and model translational control mechanisms.
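To make the kind of counting described above concrete, the following is a minimal Python sketch of tallying 1- to 5-codon combinations and locating internal UGA codons in a single coding sequence. It is not the CUT implementation: the function names (`split_codons`, `count_codon_runs`, `internal_stop_positions`) and the example sequence are hypothetical, and it assumes an in-frame coding sequence, a window that slides one codon at a time, and that "internal" means any in-frame UGA before the final codon.

```python
from collections import Counter


def split_codons(cds):
    """Split an in-frame coding sequence into codons.

    Assumption: the sequence starts in frame. RNA input is converted to the
    DNA alphabet (UGA becomes TGA) and a trailing partial codon is dropped.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def count_codon_runs(cds, n):
    """Count every window of n consecutive in-frame codons.

    Windows overlap and advance one codon at a time (an assumption; the
    source does not state how windows are defined). Keys look like 'AGA-GAA'.
    """
    codons = split_codons(cds)
    windows = ("-".join(codons[i:i + n]) for i in range(len(codons) - n + 1))
    return Counter(windows)


def internal_stop_positions(cds, stop="TGA"):
    """Return 0-based codon indices of in-frame `stop` codons before the last codon."""
    codons = split_codons(cds)
    return [i for i, codon in enumerate(codons[:-1]) if codon == stop]


# Hypothetical sequence: ATG AGA GAA AGA GAA TGA AAG TAA
example = "ATGAGAGAAAGAGAATGAAAGTAA"

for n in range(1, 6):
    counts = count_codon_runs(example, n)
    print(n, counts.most_common(2))

print("AGA-GAA count:", count_codon_runs(example, 2)["AGA-GAA"])
print("internal TGA codon indices:", internal_stop_positions(example))
```

Running the same functions over every transcript in a genome and storing the per-gene counters would give a table that can be compared across groups of genes. The statistical test for over-representation is not specified here and is left out of the sketch.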