Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States.
Biochemistry. 2024 Nov 19;63(22):3045-3056. doi: 10.1021/acs.biochem.4c00573. Epub 2024 Nov 7.
Markov State Models (MSMs) have been widely applied to understand protein folding mechanisms by predicting long-time-scale dynamics from ensembles of short molecular simulations. Most MSM estimators enforce detailed balance, which assumes that the trajectory data are sampled at equilibrium. This is rarely the case for ab initio folding studies, however, and as a result, MSMs can severely underestimate protein folding stabilities from such data. To remedy this problem, we have developed an enhanced-sampling protocol in which (1) unbiased folding simulations are performed and sparse tICA is used to obtain features that best capture the slowest events in folding, (2) umbrella sampling along this reaction coordinate is performed to observe folding and unfolding transitions, and (3) the thermodynamics and kinetics of folding are estimated using multiensemble Markov models (MEMMs). Using this protocol, the folding pathways, rates, and stabilities of a designed α-helical hairpin, Z34C, can be predicted in good agreement with experimental measurements. These results indicate that accurate simulation-based estimates of absolute folding stabilities are within reach, with implications for the computational design of folded miniproteins and peptidomimetics.
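The underestimation of folding stability described above can be illustrated with a toy calculation. The sketch below is not the authors' protocol or estimator: it assumes a hypothetical two-state model (unfolded and folded) with made-up transition probabilities, computes the expected transition counts for short trajectories that all start unfolded, and imposes detailed balance by naive count symmetrization, a simple stand-in that is only justified for equilibrium-sampled data. Production reversible estimators are more sophisticated, so the size of the effect here is illustrative only. The temperature of 300 K is also an assumption.

```python
import numpy as np

# Hypothetical two-state model (assumed values): state 0 = unfolded, state 1 = folded.
T_true = np.array([[0.990, 0.010],
                   [0.001, 0.999]])

RT = 0.0019872 * 300.0  # kcal/mol at an assumed temperature of 300 K


def stationary(T):
    """Stationary distribution: left eigenvector of T with eigenvalue 1."""
    evals, evecs = np.linalg.eig(T.T)
    v = np.real(evecs[:, np.argmax(np.real(evals))])
    return v / v.sum()


def expected_counts(T, p0, n_steps):
    """Expected transition counts for trajectories of n_steps started from p0."""
    C = np.zeros_like(T)
    p = np.asarray(p0, dtype=float)
    for _ in range(n_steps):
        C += p[:, None] * T  # expected i -> j transitions at this step
        p = p @ T            # propagate the state distribution one step
    return C


def folding_free_energy(pi):
    """Folding free energy (kcal/mol) from state populations; negative favors folded."""
    return -RT * np.log(pi[1] / pi[0])


# Short trajectories, all started in the unfolded state (non-equilibrium sampling).
C = expected_counts(T_true, p0=[1.0, 0.0], n_steps=100)

# Row-normalized counts: no detailed-balance constraint.
T_plain = C / C.sum(axis=1, keepdims=True)

# Naive symmetrization: imposes detailed balance, assumes equilibrium sampling.
C_sym = C + C.T
T_sym = C_sym / C_sym.sum(axis=1, keepdims=True)

pi_true = stationary(T_true)
pi_plain = stationary(T_plain)
pi_sym = stationary(T_sym)

print("folded population (true, plain, symmetrized):",
      pi_true[1], pi_plain[1], pi_sym[1])
print("folding free energy, kcal/mol (true, symmetrized):",
      folding_free_energy(pi_true), folding_free_energy(pi_sym))
```

In this toy setting the row-normalized estimate recovers the true folded population, while the symmetrized estimate is pulled toward the starting (unfolded) state, so the folded state appears less stable than it is.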