Husic Brooke E, McGibbon Robert T, Sultan Mohammad M, Pande Vijay S
Department of Chemistry, Stanford University, Stanford, California 94305, USA.
J Chem Phys. 2016 Nov 21;145(19):194103. doi: 10.1063/1.4967809.
As molecular dynamics simulations access increasingly long time scales, complementary advances in the analysis of biomolecular time-series data are necessary. Markov state models offer a powerful framework for this analysis by describing a system's states and the transitions between them. A recently established variational theorem for Markov state models now enables modelers to systematically determine the best way to describe a system's dynamics. In the context of the variational theorem, we analyze ultra-long folding simulations for a canonical set of twelve proteins [K. Lindorff-Larsen et al., Science 334, 517 (2011)] by creating and evaluating many types of Markov state models. We present a set of guidelines for constructing Markov state models of protein folding; namely, we recommend the use of cross-validation and a kinetically motivated dimensionality reduction step for improved descriptions of folding dynamics. We also warn that precise kinetic predictions depend on the features chosen to describe the system, and we pose the description of kinetic uncertainty across ensembles of models as an open issue.
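As a rough illustration of how a Markov state model can be scored variationally and cross-validated, the following minimal sketch builds a model from a discrete state trajectory and evaluates a Rayleigh-quotient score on held-out data. It is not the authors' pipeline: the three-state transition matrix, the lag time, the single train/test split, and all function names are assumptions made for illustration, and the score is written in the generalized matrix Rayleigh quotient form (the trace of the projected time-lagged correlation matrix times the inverse of the projected instantaneous one), which is assumed here to stand in for the variational score.

```python
import numpy as np

def correlation_matrices(dtraj, n_states, lag):
    """Symmetrized time-lagged matrix C and instantaneous matrix S (hypothetical helper)."""
    counts = np.zeros((n_states, n_states))
    # Count transitions i -> j separated by `lag` steps.
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    C = 0.5 * (counts + counts.T)   # assumes reversible dynamics
    C /= C.sum()
    S = np.diag(C.sum(axis=1))      # diagonal of estimated state populations
    return C, S

def fit_eigenvectors(C, S, m):
    """Top-m solutions of the generalized eigenproblem C v = lambda S v."""
    pops = np.diag(S)
    if np.any(pops <= 0):
        raise ValueError("every state must be visited in the training data")
    d = 1.0 / np.sqrt(pops)
    # Symmetric form: S^{-1/2} C S^{-1/2} has the same eigenvalues.
    w, U = np.linalg.eigh(d[:, None] * C * d[None, :])
    order = np.argsort(w)[::-1][:m]
    return w[order], d[:, None] * U[:, order]

def rayleigh_score(A, C, S):
    """Trace of (A^T C A)(A^T S A)^{-1}; larger means slower captured dynamics."""
    return float(np.trace(A.T @ C @ A @ np.linalg.inv(A.T @ S @ A)))

# Synthetic three-state chain (assumed for illustration only).
T_true = np.array([[0.98, 0.02, 0.00],
                   [0.01, 0.97, 0.02],
                   [0.00, 0.03, 0.97]])
rng = np.random.default_rng(0)
n_steps, n_states, lag, m = 20000, 3, 1, 2
dtraj = np.zeros(n_steps, dtype=int)
for t in range(1, n_steps):
    dtraj[t] = rng.choice(n_states, p=T_true[dtraj[t - 1]])

# Single train/test split; real cross-validation would repeat this over folds.
train, test = dtraj[: n_steps // 2], dtraj[n_steps // 2:]
C_train, S_train = correlation_matrices(train, n_states, lag)
C_test, S_test = correlation_matrices(test, n_states, lag)

eigvals, A = fit_eigenvectors(C_train, S_train, m)
train_score = rayleigh_score(A, C_train, S_train)
test_score = rayleigh_score(A, C_test, S_test)
print("train score:", train_score)
print("test score: ", test_score)
```

On the training data the score equals the sum of the top m eigenvalues, so it can only improve as the model is made more flexible. The held-out score is what allows different modeling choices (features, number of states, lag time) to be compared without rewarding overfitting.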