Moritsugu Kei
Graduate School of Medical Life Science, Yokohama City University, Yokohama 230-0045, Japan.
Life (Basel). 2021 Oct 12;11(10):1076. doi: 10.3390/life11101076.
Multiscale enhanced sampling (MSES) enhances the sampling of all-atom protein structures by coupling the all-atom system to the accelerated dynamics of an associated coarse-grained (CG) model. In this paper, we propose an MSES extension that replaces the CG model with dynamics on a reduced subspace generated by a machine learning approach, the variational autoencoder (VAE). Molecular dynamics (MD) trajectories of the ribose-binding protein (RBP) in both the closed and open forms were used as input: inter-residue distances were extracted as structural features to train the VAE model, so that the encoded latent layer characterizes the difference in structural dynamics between the closed and open forms. Interpolated data characterizing the RBP structural change between the closed and open forms were then generated efficiently in the low-dimensional latent space of the VAE and decoded into time-series data of the inter-residue distances, which were used to drive structural sampling at atomistic resolution via the MSES scheme. The free energy surfaces on the latent space showed that the generated data, which had a single basin, were refined into simulated data containing two basins, closed and open, illustrating the usefulness of MD simulation with a molecular mechanics force field in recovering the correct structural ensemble.
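As a rough illustration of the data flow described in the abstract, the following Python sketch shows three of the numerical steps: computing inter-residue distance features from coordinates, interpolating linearly between two latent-space points, and estimating a free energy profile from sampled latent values as F = -kT ln P. It is not the authors' implementation. All function names are hypothetical, the VAE encoder and decoder and the MSES coupling itself are omitted, linear interpolation is an assumed choice, and the free energy is computed along a single latent coordinate for simplicity.

```python
import numpy as np


def pairwise_distances(coords, pairs):
    """Inter-residue distances used as structural features.

    coords: array of shape (n_frames, n_residues, 3), e.g. one
            representative atom position per residue (assumption).
    pairs:  integer array of shape (n_pairs, 2) with residue indices.
    Returns an array of shape (n_frames, n_pairs).
    """
    diff = coords[:, pairs[:, 0], :] - coords[:, pairs[:, 1], :]
    return np.linalg.norm(diff, axis=-1)


def interpolate_latent(z_start, z_end, n_steps):
    """Linear path between two latent points (assumed interpolation rule).

    z_start, z_end: arrays of shape (latent_dim,), e.g. the mean latent
                    positions of the closed and open forms.
    Returns an array of shape (n_steps, latent_dim).
    """
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z_start + t * z_end


def free_energy_1d(z, bins, kT=1.0):
    """Free energy profile F = -kT ln P along one latent coordinate.

    The minimum of F over populated bins is shifted to zero; empty bins
    get F = +inf.
    """
    hist, edges = np.histogram(z, bins=bins, density=True)
    with np.errstate(divide="ignore"):
        free_energy = -kT * np.log(hist)
    free_energy -= free_energy[np.isfinite(free_energy)].min()
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, free_energy
```

In a workflow of the kind the abstract describes, the interpolated latent path would be passed through a trained decoder to obtain a time series of target inter-residue distances, and the latent values of the resulting simulation frames would be histogrammed to compare basins before and after sampling.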