Department of Statistics, Oxford University, 1 South Parks Road, Oxford OX1 3TG, UK.
Nucleic Acids Res. 2010 Oct;38(19):6719-28. doi: 10.1093/nar/gkq495. Epub 2010 Jun 8.
Translation of mRNA into protein is a unidirectional information flow process. Analysing the input (mRNA) and output (protein) of translation, we find that local protein structure information is encoded in the mRNA nucleotide sequence. The Coding Sequence and Structure (CSandS) database developed in this work provides a detailed mapping between over 4000 solved protein structures and their mRNA. CSandS facilitates a comprehensive analysis of codon usage over many organisms. In assigning translation speed, we find that relative codon usage is less informative than tRNA concentration. For all speed measures, no evidence was found that domain boundaries are enriched with slow codons. In fact, genes seemingly avoid slow codons around structurally defined domain boundaries. Translation speed, however, does decrease at the transition into secondary structure. Codons are identified that have structural preferences significantly different from the amino acid they encode. However, each organism has its own set of 'significant codons'. Our results support the premise that codons encode more information than merely amino acids and give insight into the role of translation in protein folding.
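As an illustration of the "relative codon usage" measure mentioned in the abstract, the following minimal sketch computes, for a single coding sequence, the fraction each codon contributes within its synonymous family. It is not the method used to build CSandS: the function name `relative_codon_usage` and the toy sequence are hypothetical, the standard genetic code is assumed, and reading a low fraction as a putatively slow codon is only a working assumption. The abstract itself reports that tRNA concentration is the more informative speed measure.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def relative_codon_usage(cds):
    """Return {codon: fraction of its synonymous family} for one coding sequence.

    Hypothetical helper for illustration only. A real analysis would pool
    counts over many genes of one organism rather than use a single gene.
    """
    cds = cds.upper().replace("U", "T")  # accept mRNA or DNA alphabets
    usable = len(cds) - len(cds) % 3     # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # Count only recognised codons (skips ambiguous bases such as N).
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Total observations per amino acid (synonymous family).
    family_totals = Counter()
    for codon, n in counts.items():
        family_totals[CODON_TABLE[codon]] += n

    return {codon: n / family_totals[CODON_TABLE[codon]]
            for codon, n in counts.items()}


# Toy example: Met, Ala, Ala, Ala, Lys, stop.
usage = relative_codon_usage("ATGGCTGCTGCCAAATAA")
for codon, fraction in sorted(usage.items()):
    print(codon, CODON_TABLE[codon], round(fraction, 3))
```

In the toy sequence, alanine is encoded twice by GCT and once by GCC, so GCT receives a fraction of 2/3 and GCC 1/3. Codons that are the only observed member of their family receive 1.0.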