Adiguzel Y
Biophysics Department, School of Medicine, Istanbul Kemerburgaz University, Istanbul, Turkey.
Biosystems. 2017 Sep;159:1-11. doi: 10.1016/j.biosystems.2017.05.003. Epub 2017 Jun 15.
Based on Shannon's theory of communication, the information content of an entire polymeric macromolecule can be calculated in bits by summing the entropies of its building blocks. Proteins, DNA, and RNA are such macromolecules. When variation among the building blocks is considered the only source of entropy, a protein of a given length appears to carry less information than the coding sequence of the mRNA that corresponds to it. This decrease in information seems contradictory, but the apparent conflict is resolved by treating conformational variation in proteins as an additional variable in the calculation and balancing the approximated entropies of the mRNA coding region and the protein. Because the probabilities of conformational states need not be equal, we also assigned hypothetical probabilities to them; these represent an uneven distribution of the time spent in each conformation and thus the probability of finding the protein in any one of its possible conformations. The results obtained with the hypothetical probabilities are in line with experimental values for conformational-state variation in protein populations. This equalization approach has further biological relevance: it compensates for the degeneracy of codon usage during translation, and it leads to the conclusion that the protein alphabet size is close to optimal for proper protein function within the thermodynamic milieu of the cell. The findings are also discussed in relation to codon bias and have implications for the concept of codon evolution. Overall, this work brings together the fields of protein structural studies and molecular translation processes through a novel approach.
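The entropy comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's actual calculation: it assumes uniform probabilities over 4 nucleotides and 20 amino acids, counts all 3 nucleotides per codon, and uses a hypothetical 100-residue protein. The function names, the example length, and the 0.9/0.1 two-state distribution are assumptions introduced here for illustration only.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)

def sequence_information(length, alphabet_size):
    """Total bits for a sequence whose symbols are equiprobable (assumption)."""
    return length * math.log2(alphabet_size)

n_residues = 100  # hypothetical protein length

# Coding mRNA: 3 nucleotides per residue, 4-letter alphabet.
mrna_bits = sequence_information(3 * n_residues, 4)      # 600.0 bits

# Protein: 1 residue per position, 20-letter alphabet.
protein_bits = sequence_information(n_residues, 20)      # about 432.19 bits

# Apparent information deficit per residue, equal to log2(64 / 20).
gap_per_residue = (mrna_bits - protein_bits) / n_residues  # about 1.678 bits

# Number of equiprobable conformational states per residue that would
# close the gap under these assumptions (illustrative only).
balancing_states = 2 ** gap_per_residue                  # about 3.2

# Uneven (hypothetical) occupancy of two conformations carries less
# entropy than an even split.
even_bits = shannon_entropy([0.5, 0.5])    # 1.0 bit
uneven_bits = shannon_entropy([0.9, 0.1])  # about 0.469 bits

print(mrna_bits, round(protein_bits, 2), round(gap_per_residue, 3))
print(round(balancing_states, 2), even_bits, round(uneven_bits, 3))
```

Under these simplified assumptions the gap is about 1.68 bits per residue, and an uneven distribution over conformations contributes less entropy than an even one, which is why the assigned probabilities matter for the balance.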