National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Proc Natl Acad Sci U S A. 2010 Feb 16;107(7):2983-8. doi: 10.1073/pnas.0910445107. Epub 2010 Jan 26.
The hypothesis that folding robustness is the primary determinant of the evolution rate of proteins is explored using a coarse-grained off-lattice model. The simplicity of the model allows rapid computation of the folding probability of a sequence to any folded conformation. For each robust folder, the network of sequences that share its native structure is identified. The fitness of a sequence is postulated to be a simple function of the number of misfolded molecules that have to be produced to reach a characteristic protein abundance. After fixation probabilities of mutants are computed under a simple population dynamics model, a Markov chain on the fold network is constructed, and the fold-averaged evolution rate is computed. The distribution of the logarithm of the evolution rates across distinct networks exhibits a peak with a long tail on the low rate side and resembles the universal empirical distribution of the evolutionary rates more closely than either distribution resembles the log-normal distribution. The results suggest that the universal distribution of the evolutionary rates of protein-coding genes is a direct consequence of the basic physics of protein folding.
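The pipeline described in the abstract (fitness from misfolding cost, fixation probabilities, a Markov chain on the fold network, a fold-averaged rate) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration and not the authors' method: the exponential fitness form, the `cost` and `abundance` parameters, the Kimura-style haploid fixation formula, and the use of a stationary-distribution average as the "fold-averaged" rate. The folding probabilities are supplied by hand instead of being computed from an off-lattice model.

```python
import math

def fitness(p_fold, abundance=1000.0, cost=1e-4):
    """Assumed fitness: exponential penalty on misfolded molecules.

    To obtain `abundance` correctly folded molecules when each copy folds
    with probability p_fold, about abundance * (1 - p_fold) / p_fold
    misfolded copies are produced along the way.
    """
    misfolded = abundance * (1.0 - p_fold) / p_fold
    return math.exp(-cost * misfolded)

def fixation_probability(f_old, f_new, n_eff=1000):
    """Assumed Kimura-style fixation probability for a haploid population."""
    s = f_new / f_old - 1.0            # selection coefficient of the mutant
    if abs(s) < 1e-12:
        return 1.0 / n_eff             # neutral limit
    if -2.0 * n_eff * s > 700.0:
        return 0.0                     # strongly deleterious; avoids overflow
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * n_eff * s)

def fold_averaged_rate(neighbors, p_fold, n_eff=1000, steps=20000):
    """Average substitution rate over one fold network.

    neighbors[i] lists the sequences one mutation away from sequence i that
    share the same native structure (assumed connected, at least one edge).
    Rates are in units of the per-neighbor mutation rate, so a neutral
    step has rate 1.
    """
    n = len(p_fold)
    f = [fitness(p) for p in p_fold]
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            q[i][j] = n_eff * fixation_probability(f[i], f[j], n_eff)
    out_rate = [sum(row) for row in q]

    # Stationary distribution by power iteration on the uniformized chain.
    lam = 2.0 * max(out_rate)
    pi = [1.0 / n] * n
    for _ in range(steps):
        new = [pi[i] * (1.0 - out_rate[i] / lam) for i in range(n)]
        for i in range(n):
            for j in neighbors[i]:
                new[j] += pi[i] * q[i][j] / lam
        pi = new
    return sum(pi[i] * out_rate[i] for i in range(n))

# Hypothetical four-sequence network arranged in a ring.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
print(fold_averaged_rate(ring, [0.9, 0.8, 0.7, 0.8]))
```

Repeating the calculation over many networks and histogramming the logarithm of the resulting rates would give the kind of distribution the abstract compares with the empirical one. The shape obtained from this sketch depends entirely on the assumed functional forms.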