Integrated Drug Discovery, Molecular Design Sciences, Sanofi, Vitry-sur-Seine, France.
CERMICS, Ecole des Ponts, Marne-la-Vallée, France.
J Chem Phys. 2023 Jul 14;159(2). doi: 10.1063/5.0151053.
The heat shock protein 90 (Hsp90) is a molecular chaperone that controls the folding and activation of client proteins using the free energy of ATP hydrolysis. The Hsp90 active site is in its N-terminal domain (NTD). Our goal is to characterize the dynamics of the NTD using an autoencoder-learned collective variable (CV) in conjunction with adaptive biasing force Langevin dynamics. Using dihedral analysis, we cluster all available experimental Hsp90 NTD structures into distinct native states. We then perform unbiased molecular dynamics (MD) simulations to construct a dataset that represents each state and use this dataset to train an autoencoder. Two autoencoder architectures are considered, with one and two hidden layers, respectively, and bottlenecks of dimension k ranging from 1 to 10. We demonstrate that adding an extra hidden layer does not significantly improve performance but leads to more complicated CVs that increase the computational cost of biased MD calculations. In addition, a two-dimensional (2D) bottleneck provides enough information to distinguish the different states, while the optimal bottleneck dimension is five. For the 2D bottleneck, the 2D CV is used directly in biased MD simulations. For the five-dimensional (5D) bottleneck, we analyze the latent CV space and identify the pair of CV coordinates that best separates the states of Hsp90. Interestingly, selecting a 2D CV out of the 5D CV space leads to better results than directly learning a 2D CV and allows transitions between native states to be observed when running free-energy-biased dynamics.
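The step of picking a pair of coordinates out of a 5D latent space can be illustrated with a minimal sketch. The abstract does not state which criterion the authors used to rank pairs, so the Fisher-style ratio below (between-state scatter divided by within-state scatter) is an assumption chosen for illustration. The function names `separation_score` and `best_cv_pair` are hypothetical, and the synthetic array `latent` stands in for the encoder output on labeled MD frames.

```python
import itertools
import numpy as np


def separation_score(z2, labels):
    """Between-state scatter over within-state scatter for 2D points.

    Assumed criterion for illustration; not taken from the source.
    """
    overall_mean = z2.mean(axis=0)
    between, within = 0.0, 0.0
    for state in np.unique(labels):
        pts = z2[labels == state]
        mu = pts.mean(axis=0)
        between += len(pts) * np.sum((mu - overall_mean) ** 2)
        within += np.sum((pts - mu) ** 2)
    return between / within


def best_cv_pair(latent, labels):
    """Score every pair of latent coordinates and return the best one."""
    # Standardize each coordinate so the score does not depend on its scale.
    z = (latent - latent.mean(axis=0)) / latent.std(axis=0)
    scores = {
        (i, j): separation_score(z[:, [i, j]], labels)
        for i, j in itertools.combinations(range(z.shape[1]), 2)
    }
    return max(scores, key=scores.get), scores


# Synthetic stand-in for a 5D latent space: three states, 200 frames each.
rng = np.random.default_rng(0)
n_per_state = 200
labels = np.repeat(np.arange(3), n_per_state)
latent = rng.normal(size=(3 * n_per_state, 5))

# Only coordinates 1 and 3 carry state information in this toy dataset.
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
latent[:, [1, 3]] += centers[labels]

best_pair, scores = best_cv_pair(latent, labels)
print("best pair:", best_pair)
```

In this toy setup the state labels play the role of the native-state clusters obtained from dihedral analysis. Any other separation measure could be substituted for the ratio without changing the surrounding loop over the 10 coordinate pairs.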