Department of Physics, Eidgenössische Technische Hochschule (ETH) Zürich, 8092 Zürich, Switzerland;
Atomistic Simulations, Italian Institute of Technology, 16163 Genova, Italy.
Proc Natl Acad Sci U S A. 2021 Nov 2;118(44). doi: 10.1073/pnas.2113533118.
The development of enhanced sampling methods has greatly extended the scope of atomistic simulations, allowing long-time phenomena to be studied with accessible computational resources. Many such methods rely on the identification of an appropriate set of collective variables. These are meant to describe the system's modes that most slowly approach equilibrium under the action of the sampling algorithm. Once identified, the equilibration of these modes is accelerated by the enhanced sampling method of choice. An attractive way of determining the collective variables is to relate them to the eigenfunctions and eigenvalues of the transfer operator. Unfortunately, this requires prior knowledge of the system's long-term dynamics, which is generally not available. However, we have recently shown that it is indeed possible to determine efficient collective variables starting from biased simulations. In this paper, we bring the power of machine learning and the efficiency of the recently developed on-the-fly probability-enhanced sampling method to bear on this approach. The result is a powerful and robust algorithm that, given an initial enhanced sampling simulation performed with trial collective variables or generalized ensembles, extracts transfer operator eigenfunctions using a neural network ansatz and then accelerates them to promote sampling of rare events. To illustrate the generality of this approach, we apply it to several systems, ranging from the conformational transition of a small molecule to the folding of a miniprotein and the study of materials crystallization.
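The link between collective variables and transfer operator eigenfunctions can be illustrated with a minimal sketch. The abstract describes a neural network ansatz; the code below instead uses a plain linear ansatz (time-lagged independent component analysis) as a simpler stand-in, so it should not be read as the authors' implementation. The function name `tica_eigenfunctions`, the `weights` argument, and the way weights are applied are assumptions made for illustration only, and the sketch does not attempt to reproduce the paper's treatment of biased trajectories.

```python
import numpy as np


def tica_eigenfunctions(features, lag, weights=None):
    """Linear variational estimate of transfer operator eigenpairs.

    features : (n_frames, n_features) array of descriptors along a trajectory
    lag      : lag time in frames
    weights  : optional per-frame weights (hypothetical hook for reweighting
               a biased run); uniform weights are used when omitted
    """
    features = np.asarray(features, dtype=float)
    x0, xt = features[:-lag], features[lag:]

    # Normalized weights attached to the starting frame of each lagged pair.
    w = np.ones(len(x0)) if weights is None else np.asarray(weights, float)[:-lag]
    w = w / w.sum()

    # Remove the weighted mean, averaged over both ends of the pairs.
    mean = w @ (x0 + xt) / 2
    x0, xt = x0 - mean, xt - mean

    # Symmetrized instantaneous and time-lagged covariance matrices.
    c0 = ((x0 * w[:, None]).T @ x0 + (xt * w[:, None]).T @ xt) / 2
    ct = ((x0 * w[:, None]).T @ xt + (xt * w[:, None]).T @ x0) / 2

    # Solve the generalized problem ct v = lambda c0 v by whitening with the
    # Cholesky factor of c0 (assumes c0 is positive definite).
    chol_inv = np.linalg.inv(np.linalg.cholesky(c0))
    evals, u = np.linalg.eigh(chol_inv @ ct @ chol_inv.T)
    evecs = chol_inv.T @ u

    # Largest eigenvalue first: it corresponds to the slowest mode.
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order], mean


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    demo = np.cumsum(rng.normal(size=(5000, 3)), axis=0) * 0.01
    demo += rng.normal(size=demo.shape)
    evals, evecs, mean = tica_eigenfunctions(demo, lag=10)
    slow_cv = (demo - mean) @ evecs[:, 0]  # candidate collective variable
    print(evals, slow_cv.shape)
```

In this sketch the leading eigenvector defines a linear combination of the input descriptors that decorrelates most slowly, which is the quantity one would then bias. A neural network ansatz replaces the fixed linear basis with learned nonlinear features but is optimized toward the same kind of eigenvalue problem.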