Li Rui Fang, Li Hong
School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China.
Gen Physiol Biophys. 2011 Jun;30(2):154-61. doi: 10.4149/gpb_2011_02_154.
Protein folding rates are currently believed to depend on protein structure and amino acid sequence. However, few studies have examined whether the folding rate is also influenced by the corresponding mRNA sequence. In this paper, we analyze the possible relationship between protein folding rates and the corresponding mRNA sequences. The guanine and cytosine content (GC content) of palindromes in the protein coding sequence was introduced as a new parameter and added to Gromiha's model for predicting protein folding rates, in order to examine its effect on the folding process. Multiple linear regression analysis and a jack-knife test show that the new parameter is significant. Compared with Gromiha's results, the linear correlation coefficient between the experimental and predicted folding rates increased significantly from 0.96 to 0.99, and the population variance decreased from 0.50 to 0.24. These results indicate that the GC content of palindromes in the protein coding sequence influences the protein folding rate. Further analysis indicates that this effect comes mostly from synonymous codon usage and from the information carried by the palindrome structure itself, rather than from the codon-to-amino-acid translation information.
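As an illustration of the new parameter, the following minimal Python sketch computes the GC content of palindromic regions in a coding sequence. The abstract does not state how palindromes were defined or which lengths were counted, so everything specific here is an assumption: palindromes are taken to be reverse-complement palindromes (a stretch equal to its own reverse complement), the length bounds `min_len` and `max_len` are hypothetical, and the function names are illustrative rather than taken from the paper.

```python
# Assumption: "palindrome" means a reverse-complement palindrome in the
# DNA coding sequence; the paper's exact definition may differ.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 12):
    """Return (start, end) spans of reverse-complement palindromes.

    min_len and max_len are hypothetical bounds chosen for illustration.
    Only even lengths are scanned: in an odd-length window the middle base
    would have to be its own complement, which is impossible.
    """
    seq = seq.upper()
    spans = []
    first_even = min_len + (min_len % 2)
    for length in range(first_even, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                spans.append((start, start + length))
    return spans


def palindrome_gc_content(seq: str, min_len: int = 4, max_len: int = 12) -> float:
    """Fraction of G/C among all positions covered by at least one palindrome.

    Overlapping palindromes are counted once per position. Returns 0.0 when
    no palindrome is found.
    """
    seq = seq.upper()
    covered = set()
    for start, end in find_palindromes(seq, min_len, max_len):
        covered.update(range(start, end))
    if not covered:
        return 0.0
    gc = sum(1 for i in covered if seq[i] in "GC")
    return gc / len(covered)
```

Under these assumptions, a value computed per coding sequence could be appended as one extra predictor column in a multiple linear regression on the folding rate. Counting each covered position once is a design choice of this sketch; averaging the GC content over individual palindromes would be an equally plausible reading of the parameter.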