Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China.
J Chem Phys. 2011 Feb 21;134(7):075103. doi: 10.1063/1.3519056.
Characterizing the conformations of a protein in the transition state ensemble (TSE) is important for studying protein folding. A promising approach to studying the TSE, pioneered by Vendruscolo et al. [Nature (London) 409, 641 (2001)], is to generate conformations that satisfy all constraints imposed by the experimentally measured φ values, which provide information about the native likeness of the transition states. Faísca et al. [J. Chem. Phys. 129, 095108 (2008)] used Markov chain Monte Carlo (MCMC) simulations to generate TSE conformations based on the criterion that, starting from a TS conformation, the probabilities of folding and unfolding are about equal. In this study, we use the constrained sequential Monte Carlo method [Lin et al., J. Chem. Phys. 129, 094101 (2008); Zhang et al., Proteins 66, 61 (2007)] to generate TSE conformations of acylphosphatase, a 98-residue protein, that satisfy the φ-value constraints as well as the criterion that each conformation has a folding probability of 0.5, as estimated by Monte Carlo simulations. We adopt a two-stage process and first generate 5000 contact maps satisfying the φ-value constraints. Each contact map is then used to generate 1000 properly weighted conformations. After clustering similar conformations, we obtain a set of properly weighted samples of 4185 candidate clusters. A representative conformation of each of these clusters is then selected, and 50 runs of MCMC simulation are carried out using a regrowth move set. We then select a subset of 1501 conformations that have equal probabilities of folding and unfolding as the set of TSE. These 1501 samples characterize well the distribution of transition state ensemble conformations of acylphosphatase. Compared with previous studies, our approach can access a much wider conformational space and can objectively generate conformations that satisfy the φ-value constraints and the criterion of 0.5 folding probability without bias.
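The 0.5 folding probability criterion can be illustrated with a minimal sketch. It assumes that each MCMC run started from a representative conformation is recorded as a boolean (True if the run reached the folded state before the unfolded state), and that a conformation is kept when its estimated folding probability lies within a tolerance of 0.5. The function names, the tolerance value, and the toy data are hypothetical; the abstract does not state the acceptance window that was used.

```python
from typing import Dict, List


def estimate_pfold(outcomes: List[bool]) -> float:
    """Estimate the folding probability as the fraction of runs that folded first."""
    if not outcomes:
        raise ValueError("at least one run is required")
    return sum(outcomes) / len(outcomes)


def select_tse(runs: Dict[str, List[bool]], tol: float = 0.1) -> List[str]:
    """Keep conformations whose estimated pfold is within `tol` of 0.5.

    `tol` is an illustrative assumption, not a value taken from the study.
    """
    return [
        name
        for name, outcomes in runs.items()
        if abs(estimate_pfold(outcomes) - 0.5) <= tol
    ]


# Toy data: 50 runs per candidate cluster representative (made-up outcomes).
runs = {
    "cluster_a": [True] * 25 + [False] * 25,  # 25 of 50 runs folded
    "cluster_b": [True] * 45 + [False] * 5,   # 45 of 50 runs folded
    "cluster_c": [True] * 22 + [False] * 28,  # 22 of 50 runs folded
}
selected = select_tse(runs)
```

With only 50 runs per conformation the estimate carries a binomial standard error of about 0.07 near 0.5, which is why some acceptance window around 0.5 is needed in any practical selection.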
In contrast to previous studies, our results show that transition state conformations are very diverse and are far from nativelike when measured in Cartesian root-mean-square deviation (cRMSD): the average cRMSD between TSE conformations and the native structure is 9.4 Å for this short protein, instead of the 6 Å reported in previous studies. In addition, we found that the average fraction of native contacts in the TSE is 0.37, with enrichment in native-like β-sheets and a shortage of long-range contacts, suggesting that such contacts form at a later stage of folding. We further calculate the first passage time of folding of TSE conformations by assigning a physical time to the regrowth moves in the MCMC simulation: these moves are mapped to a Markovian state model whose transition times were obtained from Langevin dynamics simulations. Our results indicate that, despite the large structural diversity of the TSE, its conformations are characterized by similar folding times. Our approach is general and can be used to study the TSE of other macromolecules.
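As an illustration of the fraction of native contacts quoted above, the following sketch counts how many contacts of a native structure are also present in a given conformation. It assumes one coordinate per residue, a distance cutoff of 6.5 Å, and a minimum sequence separation of 3 residues; these parameters, the function names, and the six-bead toy chain are hypothetical, since the abstract does not give the contact definition used in the study.

```python
import math
from typing import List, Set, Tuple

Coord = Tuple[float, float, float]


def contact_set(coords: List[Coord], cutoff: float = 6.5,
                min_sep: int = 3) -> Set[Tuple[int, int]]:
    """Return residue pairs (i, j), j - i >= min_sep, closer than `cutoff`."""
    contacts = set()
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def fraction_native_contacts(conf: List[Coord], native: List[Coord],
                             cutoff: float = 6.5, min_sep: int = 3) -> float:
    """Fraction of the native contacts that are also formed in `conf`."""
    native_contacts = contact_set(native, cutoff, min_sep)
    if not native_contacts:
        raise ValueError("native structure has no contacts under this definition")
    conf_contacts = contact_set(conf, cutoff, min_sep)
    return len(native_contacts & conf_contacts) / len(native_contacts)


# Toy six-bead chain with 3.8 Å bonds (made-up coordinates, not acylphosphatase).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
# Same chain with the last bead swung away from the hairpin.
partial = native[:5] + [(3.8, 7.6, 0.0)]
# Fully extended chain along the x axis.
extended = [(3.8 * k, 0.0, 0.0) for k in range(6)]

q_partial = fraction_native_contacts(partial, native)
q_extended = fraction_native_contacts(extended, native)
```

Averaging this quantity over an ensemble of conformations gives a single summary number of the kind reported in the abstract, although the value depends on the chosen cutoff and sequence separation.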