John Strahan, Chatipat Lorpaiboon, Jonathan Weare, Aaron R. Dinner
Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA.
Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA.
J Chem Phys. 2024 Aug 28;161(8). doi: 10.1063/5.0215975.
An issue for molecular dynamics simulations is that events of interest often involve timescales that are much longer than the simulation time step, which is set by the fastest timescales of the model. Because of this timescale separation, direct simulation of many events is prohibitively computationally costly. This issue can be overcome by aggregating information from many relatively short simulations that sample segments of trajectories involving events of interest. This is the strategy of Markov state models (MSMs) and related approaches, but such methods suffer from approximation error because the variables defining the states generally do not capture the dynamics fully. By contrast, once converged, the weighted ensemble (WE) method aggregates information from trajectory segments so as to yield unbiased estimates of both thermodynamic and kinetic statistics. Unfortunately, errors decay no faster than unbiased simulation in WE as originally formulated and commonly deployed. Here, we introduce a theoretical framework for describing WE that shows that the introduction of an approximate stationary distribution on top of the stratification, as in nonequilibrium umbrella sampling (NEUS), accelerates convergence. Building on ideas from MSMs and related methods, we generalize the NEUS approach in such a way that the approximation error can be reduced systematically. We show that the improved algorithm can decrease the simulation time required to achieve the desired precision by orders of magnitude.
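To make the stratification idea concrete, the following is a minimal sketch of one generic WE resampling step, not the improved algorithm introduced in the paper. It assumes a one-dimensional collective variable, strictly positive walker weights, and a simple resample-within-bin rule; the function name `we_resample` and its parameters are hypothetical and chosen only for illustration. Within each occupied bin, walkers are redrawn with probability proportional to their weights, and the bin's total weight is shared equally among the copies, so the weight carried by each bin is left unchanged.

```python
import numpy as np


def we_resample(positions, weights, bin_edges, walkers_per_bin, rng):
    """One illustrative WE resampling step (hypothetical helper).

    Within each occupied bin, draw `walkers_per_bin` walkers with
    probability proportional to their weights, then give every copy an
    equal share of the bin's total weight. Assumes all weights are > 0.
    """
    positions = np.asarray(positions, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Assign each walker to a bin along the 1D collective variable.
    bin_index = np.digitize(positions, bin_edges)

    new_positions = []
    new_weights = []
    for b in np.unique(bin_index):
        members = np.flatnonzero(bin_index == b)
        bin_weight = weights[members].sum()

        # Selection probability proportional to weight within the bin.
        chosen = rng.choice(
            members, size=walkers_per_bin, p=weights[members] / bin_weight
        )

        new_positions.append(positions[chosen])
        # Equal weights that sum to the bin's original total weight.
        new_weights.append(np.full(walkers_per_bin, bin_weight / walkers_per_bin))

    return np.concatenate(new_positions), np.concatenate(new_weights)
```

In a full WE calculation this step would alternate with short unbiased dynamics for every walker; the dynamics, the choice of bins, and the number of walkers per bin are left open here.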