Chemical Sciences Division, National Institute of Standards and Technology, Gaithersburg, Maryland 20899-8320, United States.
J Chem Theory Comput. 2022 Jun 14;18(6):3622-3636. doi: 10.1021/acs.jctc.2c00110. Epub 2022 May 25.
Discovering meaningful collective variables for enhancing sampling, via applied biasing potentials or tailored Monte Carlo (MC) move sets, remains a major challenge within molecular simulation. While recent studies identifying collective variables with variational autoencoders (VAEs) have focused on the encoding and latent space discovered by a VAE, the impact of the decoding and its ability to act as a generative model remain unexplored. We demonstrate how VAEs may be used to learn (on the fly and with minimal human intervention) highly efficient, collective Monte Carlo moves that accelerate sampling along the learned collective variable. In contrast to many machine learning-based efforts to bias sampling and generate novel configurations, our methods result in exact sampling in the ensemble of interest and do not require reweighting. In fact, we show that the acceptance rates of our moves approach unity for a perfect VAE model. While this is never observed in practice, VAE-based Monte Carlo moves still enhance sampling of new configurations. We demonstrate, however, that the form of the encoding and decoding distributions, in particular the extent to which the decoder reflects the underlying physics, greatly impacts the performance of the trained VAE.
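The claim that a generative model can drive exact sampling without reweighting, with acceptance rates approaching unity as the model becomes perfect, can be illustrated with a minimal sketch. This is not the VAE move scheme of the paper: as a simplifying assumption, the trained model is replaced by a fixed two-component Gaussian mixture with a tractable density, used as the proposal in a standard independence Metropolis-Hastings sampler. All function names, the one-dimensional double-well potential, and the parameter values below are hypothetical choices made for illustration only.

```python
import math
import random

CENTERS = (-1.0, 1.0)  # assumed mode locations of the stand-in generative model
SIGMA = 0.3            # assumed width of each mixture component


def log_mixture(x):
    """Log density of an equal-weight Gaussian mixture (stand-in for a trained model)."""
    norm = 1.0 / (len(CENTERS) * SIGMA * math.sqrt(2.0 * math.pi))
    return math.log(norm * sum(math.exp(-0.5 * ((x - c) / SIGMA) ** 2) for c in CENTERS))


def sample_mixture(rng):
    """Draw one configuration from the mixture, independent of the current state."""
    return rng.gauss(rng.choice(CENTERS), SIGMA)


def log_double_well(x, beta=4.0):
    """Unnormalized log Boltzmann weight -beta * U(x) with U(x) = (x^2 - 1)^2."""
    return -beta * (x * x - 1.0) ** 2


def run_chain(log_target, n_steps=20000, seed=0):
    """Independence Metropolis-Hastings; returns (acceptance rate, list of samples)."""
    rng = random.Random(seed)
    x = 1.0
    accepted = 0
    samples = []
    for _ in range(n_steps):
        x_new = sample_mixture(rng)
        # log of [p(x_new) q(x)] / [p(x) q(x_new)], with p the target and q the proposal
        log_ratio = (log_target(x_new) - log_mixture(x_new)) - (
            log_target(x) - log_mixture(x)
        )
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = x_new
            accepted += 1
        samples.append(x)
    return accepted / n_steps, samples


if __name__ == "__main__":
    rate_perfect, _ = run_chain(log_mixture)          # proposal identical to target
    rate_imperfect, xs = run_chain(log_double_well)   # proposal only approximates target
    print("acceptance, perfect model:  ", rate_perfect)
    print("acceptance, imperfect model:", rate_imperfect)
    print("fraction of samples in left well:", sum(x < 0 for x in xs) / len(xs))
```

When the target passed to `run_chain` is the proposal density itself, the log ratio is identically zero and every move is accepted, which mirrors the perfect-model limit described in the abstract. With the double-well target, the proposal is only approximate, so some moves are rejected, yet the Metropolis-Hastings correction keeps the chain sampling the target ensemble without any reweighting, and single moves still jump directly between the two wells.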