Department of Chemistry, Stanford University, 318 Campus Drive, Stanford, California 94305, United States.
J Phys Chem B. 2018 May 31;122(21):5291-5299. doi: 10.1021/acs.jpcb.7b06896. Epub 2017 Oct 3.
We recently showed that the time-structure-based independent component analysis (tICA) method from the Markov state model literature provides a set of variationally optimal slow collective variables for metadynamics (tICA-metadynamics). In this paper, we extend the methodology toward efficient sampling of related mutants by borrowing ideas from transfer learning methods in machine learning. Our method explicitly assumes that a similar set of slow modes and metastable states is found in both the wild type (baseline) and its mutants. Under this assumption, we describe a few simple techniques that use sequence mapping to transfer the slow modes and structural information contained in the wild type simulation to a mutant model for enhanced sampling. The resulting simulations can then be reweighted onto the full phase space using the multistate Bennett acceptance ratio (MBAR), allowing for thermodynamic comparison against the wild type. We first benchmark our methodology by recapturing alanine dipeptide dynamics across a range of different atomistic force fields, including the polarizable AMOEBA force field, after learning a set of slow modes using Amber ff99SB-ILDN. We next extend the method by including structural data from the wild type simulation and apply the technique to recapturing the effects of the GTT mutation on the FIP35 WW domain.
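The transfer idea described in the abstract can be illustrated with a minimal sketch, assuming synthetic data and plain NumPy rather than the authors' actual pipeline. It estimates tICA modes from a "wild type" feature trajectory by solving the generalized eigenproblem between the time-lagged and instantaneous covariance matrices, then projects a "mutant" trajectory onto those modes through a feature index map. The names `fit_tica`, `project`, and `feature_map`, the AR(1) toy model, and the mixing matrix are all hypothetical stand-ins; the metadynamics biasing and MBAR reweighting steps are not shown.

```python
import numpy as np

def fit_tica(X, lag, n_components=1, eps=1e-10):
    """Estimate tICA modes from a (n_frames, n_features) trajectory."""
    mean = X.mean(axis=0)
    Xc = X - mean
    A, B = Xc[:-lag], Xc[lag:]
    n = len(A)
    # Instantaneous and time-lagged covariances, symmetrized
    # (this assumes reversible dynamics).
    C0 = (A.T @ A + B.T @ B) / (2 * n)
    Ct = (A.T @ B + B.T @ A) / (2 * n)
    # Whiten with C0^{-1/2}, then solve an ordinary symmetric eigenproblem.
    s, U = np.linalg.eigh(C0)
    keep = s > eps
    W = U[:, keep] / np.sqrt(s[keep])
    lam, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(lam)[::-1][:n_components]
    return mean, W @ V[:, order], lam[order]

def project(X, mean, modes, feature_map=None):
    """Project features onto learned modes.

    feature_map[i] is the mutant column matching wild type feature i
    (a hypothetical stand-in for a sequence-based mapping).
    """
    if feature_map is not None:
        X = X[:, feature_map]
    return (X - mean) @ modes

def ar1(rng, n, phi):
    """Toy slow coordinate: a first-order autoregressive process."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

rng = np.random.default_rng(0)
n_frames, lag = 20000, 10
# Fixed, invertible mixing of one slow and two fast sources into 3 features.
mix = np.array([[1.0, 0.5, 0.2],
                [0.3, 1.0, 0.1],
                [0.2, 0.4, 1.0]])

# "Wild type": learn the slow mode from unbiased toy data.
wt_slow = ar1(rng, n_frames, 0.99)
wt_X = np.column_stack([wt_slow, rng.standard_normal((n_frames, 2))]) @ mix
mean, modes, eigvals = fit_tica(wt_X, lag)
wt_proj = project(wt_X, mean, modes)

# "Mutant": same underlying features plus one extra column that has no
# wild type counterpart; the map selects the shared columns.
mut_slow = ar1(rng, n_frames, 0.99)
core = np.column_stack([mut_slow, rng.standard_normal((n_frames, 2))]) @ mix
mut_X = np.column_stack([core[:, :1], rng.standard_normal(n_frames), core[:, 1:]])
feature_map = [0, 2, 3]
mut_proj = project(mut_X, mean, modes, feature_map)

print("leading tICA eigenvalue:", eigvals[0])
```

In this toy setting, the projection `mut_proj` plays the role of the transferred collective variable that an enhanced sampling run on the mutant would bias along. The sketch only holds under the assumption the abstract states, namely that the wild type and mutant share similar slow modes.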