Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany.
Proc Natl Acad Sci U S A. 2017 Aug 1;114(31):8265-8270. doi: 10.1073/pnas.1704803114. Epub 2017 Jul 17.
Accurate mechanistic description of structural changes in biomolecules is an increasingly important topic in structural and chemical biology. Markov models have emerged as a powerful way to approximate the molecular kinetics of large biomolecules while keeping full structural resolution in a divide-and-conquer fashion. However, the accuracy of these models is limited by that of the force fields used to generate the underlying molecular dynamics (MD) simulation data. Whereas the quality of classical MD force fields has improved significantly in recent years, remaining errors in the Boltzmann weights are still on the order of a few kT, which may lead to significant discrepancies when comparing to experimentally measured rates or state populations. Here we take the view that simulations using a sufficiently good force field sample conformations that are valid but have inaccurate weights, yet these weights may be made accurate by incorporating experimental data a posteriori. To do so, we propose augmented Markov models (AMMs), an approach that combines concepts from probability theory and information theory to consistently treat systematic force-field error and statistical errors in simulation and experiment. Our results demonstrate that AMMs can reconcile conflicting results for protein mechanisms obtained by different force fields and correct for a wide range of stationary and dynamical observables even when only equilibrium measurements are incorporated into the estimation process. This approach constitutes a unique avenue to combine experiment and computation into integrative models of biomolecular structure and dynamics.
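The idea of correcting inaccurate state weights with an equilibrium measurement can be illustrated with a minimal sketch. It is not the AMM estimator itself: it assumes a single scalar observable, treats the experimental value as exact (no statistical error model), and uses a generic maximum-entropy reweighting in which each state weight is multiplied by an exponential factor. The function name `reweight_to_match` and the three-state numbers are hypothetical and chosen only for illustration.

```python
import numpy as np


def reweight_to_match(pi, obs, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Reweight state probabilities so that the mean of `obs` equals `target`.

    Assumed (illustrative) form: pi_hat_i is proportional to
    pi_i * exp(lam * obs_i), the minimal-perturbation (maximum-entropy)
    correction for one averaged observable. Requires all pi_i > 0 and
    min(obs) < target < max(obs).
    """
    pi = np.asarray(pi, dtype=float)
    obs = np.asarray(obs, dtype=float)

    def reweighted(lam):
        # Work in log space and shift by the maximum for numerical stability.
        logw = np.log(pi) + lam * obs
        logw -= logw.max()
        w = np.exp(logw)
        return w / w.sum()

    # The reweighted mean is non-decreasing in lam (its derivative is the
    # variance of obs under pi_hat), so plain bisection is sufficient.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reweighted(mid) @ obs < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return reweighted(lam), lam


# Hypothetical three-state model: simulated weights and per-state observable.
pi_sim = np.array([0.7, 0.2, 0.1])
obs_per_state = np.array([1.0, 2.0, 4.0])
experimental_mean = 2.0  # assumed measured ensemble average

pi_hat, lam = reweight_to_match(pi_sim, obs_per_state, experimental_mean)
print("simulated mean :", pi_sim @ obs_per_state)
print("reweighted mean:", pi_hat @ obs_per_state)
print("multiplier lam :", lam)
```

In this toy case the conformational states are kept as sampled and only their weights change, which mirrors the premise that the simulation visits valid conformations with inaccurate populations. A full treatment would additionally account for experimental uncertainty and propagate the corrected weights into the transition probabilities.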