Herringer Nicholas S M, Dasetty Siva, Gandhi Diya, Lee Junhee, Ferguson Andrew L
Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States.
Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States.
J Chem Theory Comput. 2024 Jan 9;20(1):178-198. doi: 10.1021/acs.jctc.3c00923. Epub 2023 Dec 27.
The typically rugged nature of molecular free-energy landscapes can frustrate efficient sampling of the thermodynamically relevant phase space due to the presence of high free-energy barriers. Enhanced sampling techniques can improve phase space exploration by accelerating sampling along particular collective variables (CVs). A number of techniques exist for the data-driven discovery of CVs parametrizing the important large-scale motions of the system. A challenge to CV discovery is learning CVs invariant to the symmetries of the molecular system, frequently rigid translation, rigid rotation, and permutational relabeling of identical particles. Of these, permutational invariance has proved a persistent challenge, frustrating the data-driven discovery of multimolecular CVs in systems of self-assembling particles and of solvent-inclusive CVs for solvated systems. In this work, we integrate permutation invariant vector (PIV) featurizations with autoencoding neural networks to learn nonlinear CVs invariant to translation, rotation, and permutation, and we perform interleaved rounds of CV discovery and enhanced sampling to iteratively expand the sampling of configurational phase space and obtain converged CVs and free-energy landscapes. We demonstrate the permutationally invariant network for enhanced sampling (PINES) approach in applications to the self-assembly of a 13-atom argon cluster, association/dissociation of a NaCl ion pair in water, and hydrophobic collapse of a C45H92 n-pentatetracontane polymer chain. We make the approach freely available as a new module within the PLUMED2 enhanced sampling libraries.
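To illustrate why a PIV-style featurization is invariant to translation, rotation, and permutation, the following is a minimal sketch and not the PINES or PLUMED2 implementation. It assumes a simple recipe: compute all interatomic distances, pass them through a smooth switching function, group them into blocks by atom-type pair, and sort within each block. The function names `switching` and `piv`, the rational form 1/(1 + (r/r0)^n), and the default values of `r0` and `n` are illustrative assumptions.

```python
import numpy as np
from itertools import combinations_with_replacement


def switching(r, r0=1.0, n=6):
    """Smooth decay from 1 to 0 with distance (assumed rational form)."""
    return 1.0 / (1.0 + (r / r0) ** n)


def piv(coords, types, r0=1.0):
    """Sketch of a permutation invariant vector for one configuration.

    coords: (N, 3) array of Cartesian positions.
    types:  length-N sequence of atom-type labels (e.g. "Na", "O").
    """
    coords = np.asarray(coords, dtype=float)
    types = np.asarray(types)

    # Pairwise distances depend only on relative geometry, so they are
    # unchanged by rigid translation and rotation.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    blocks = []
    labels = sorted(set(types.tolist()))
    for a, b in combinations_with_replacement(labels, 2):
        ia = np.flatnonzero(types == a)
        ib = np.flatnonzero(types == b)
        if a == b:
            # Same-type block: each unordered pair counted once.
            i, j = np.triu_indices(len(ia), k=1)
            d = dist[ia[i], ia[j]]
        else:
            # Cross-type block: every a-b pair.
            d = dist[np.ix_(ia, ib)].ravel()
        # Sorting within a block removes any dependence on how identical
        # particles are labeled.
        blocks.append(np.sort(switching(d, r0)))
    return np.concatenate(blocks)
```

In a workflow like the one described above, a vector of this kind would serve as the input to the autoencoder, so that any CV learned from it inherits the same invariances. Sorting is only piecewise differentiable, which is one reason a production implementation may differ from this sketch.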