Starmer J, Stomp A, Vouk M, Bitzer D
Bioinformatics Program, North Carolina State University, Raleigh, North Carolina, USA.
PLoS Comput Biol. 2006 May;2(5):e57. doi: 10.1371/journal.pcbi.0020057. Epub 2006 May 19.
In prokaryotes, Shine-Dalgarno (SD) sequences, nucleotides upstream from start codons on messenger RNAs (mRNAs) that are complementary to ribosomal RNA (rRNA), facilitate the initiation of protein synthesis. The location of SD sequences relative to start codons and the stability of the hybridization between the mRNA and the rRNA correlate with the rate of synthesis. Thus, accurate characterization of SD sequences enhances our understanding of how an organism's transcriptome relates to its cellular proteome. We implemented the Individual Nearest Neighbor Hydrogen Bond model for oligo-oligo hybridization and created a new metric, relative spacing (RS), to identify both the location and the hybridization potential of SD sequences by simulating the binding between mRNAs and single-stranded 16S rRNA 3' tails. In 18 prokaryote genomes, we identified 2,420 genes out of 58,550 where the strongest binding in the translation initiation region included the start codon, deviating from the expected location for the SD sequence of five to ten bases upstream. We designated these as RS+1 genes. Additional analysis uncovered an unusual bias of the start codon in that the majority of the RS+1 genes used GUG, not AUG. Furthermore, of the 624 RS+1 genes whose SD sequence was associated with a free energy release of less than -8.4 kcal/mol (strong RS+1 genes), 384 were within 12 nucleotides upstream of in-frame initiation codons. The most likely explanation for the unexpected location of the SD sequence for these 384 genes is mis-annotation of the start codon. In this way, the new RS metric provides an improved method for gene sequence annotation. The remaining strong RS+1 genes appear to have their SD sequences in an unexpected location that includes the start codon. Thus, our RS metric provides a new way to explore the role of rRNA-mRNA nucleotide hybridization in translation initiation.
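The scan described in the abstract, sliding the 16S rRNA 3' tail along the mRNA and recording where the strongest binding falls relative to the start codon, can be illustrated with a minimal sketch. This is not the authors' implementation. It replaces the Individual Nearest Neighbor Hydrogen Bond free energies with a crude count of hydrogen bonds per Watson-Crick pair (3 for G-C, 2 for A-U), ignores wobble pairs, and does not compute the RS metric itself. The function names, the example sequences, and the use of the commonly cited E. coli tail are all assumptions made for illustration.

```python
# Hydrogen bonds per Watson-Crick pair: a crude stand-in for the free-energy
# parameters of the INN-HB model, which are NOT reproduced here.
WC_HBONDS = {("G", "C"): 3, ("C", "G"): 3, ("A", "U"): 2, ("U", "A"): 2}

# Assumption: the commonly cited 3' tail of E. coli 16S rRNA, written 5'->3'.
ECOLI_TAIL_5TO3 = "GAUCACCUCCUUA"


def best_run(mrna_window, tail_3to5):
    """Return (score, start, end) of the best contiguous paired run.

    start and end are offsets into mrna_window; end is exclusive.
    """
    best = (0, 0, 0)
    score, run_start = 0, 0
    for k, (m, r) in enumerate(zip(mrna_window, tail_3to5)):
        bonds = WC_HBONDS.get((m, r), 0)
        if bonds == 0:
            # A mismatch ends the current run.
            score, run_start = 0, k + 1
            continue
        score += bonds
        if score > best[0]:
            best = (score, run_start, k + 1)
    return best


def scan_for_sd(mrna, tail_5to3=ECOLI_TAIL_5TO3):
    """Slide the rRNA tail along the mRNA and return the strongest run.

    The mRNA is read 5'->3', so the tail is reversed to pair antiparallel.
    Returns (score, start, end) in mRNA coordinates; end is exclusive.
    """
    tail_3to5 = tail_5to3[::-1]
    n = len(tail_3to5)
    best = (0, 0, 0)
    for i in range(len(mrna) - n + 1):
        score, s, e = best_run(mrna[i:i + n], tail_3to5)
        if score > best[0]:
            best = (score, i + s, i + e)
    return best


def overlaps_start(run, start_index):
    """True if the paired run includes any base of the 3-nt start codon."""
    _, s, e = run
    return s < start_index + 3 and e > start_index


# Hypothetical mRNA with an SD-like AGGAGG upstream of AUG (index 18).
upstream = "CCCCCAGGAGGCCCCCCCAUGCCCCCC"
# Hypothetical mRNA whose strongest binding runs into a GUG start (index 10).
overlapping = "CCCCCAGGAGGUGCCCC"

run_a = scan_for_sd(upstream)
run_b = scan_for_sd(overlapping)
print(run_a, overlaps_start(run_a, 18))
print(run_b, overlaps_start(run_b, 10))
```

In the first sequence the strongest run lies entirely upstream of the start codon, which is the expected arrangement. In the second the run extends into the GUG codon, which is the kind of case the abstract designates RS+1. A faithful version would score each alignment with nearest-neighbor free energies rather than hydrogen-bond counts.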