Yan Shuting, Peck Jason M, Ilgu Muslum, Nilsen-Hamilton Marit, Lamm Monica H
Department of Chemical and Biological Engineering, Iowa State University, Ames, Iowa 50011, United States.
Roy J Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States.
ACS Omega. 2020 Aug 5;5(32):20187-20201. doi: 10.1021/acsomega.0c01867. eCollection 2020 Aug 18.
Using multiple independent simulations instead of one long simulation has been shown to improve the sampling performance attained with the molecular dynamics (MD) simulation method. However, it is generally not known how long each independent simulation should be, how many independent simulations should be used, or to what extent either of these factors affects the overall sampling performance achieved for a given system. The goal of the present study was to assess the sampling performance of multiple independent MD simulations, where each independent simulation begins from a different initial molecular conformation. For this purpose, we used a 25-nucleotide RNA aptamer as a case study. The initial conformations of the aptamer were derived from six predicted 3D structures. Each of the six predicted structures was energy minimized in solution and equilibrated with MD simulations at high temperature. Ten conformations from each of the six high-temperature equilibration runs were selected as initial conformations for further simulations at ambient temperature. In total, we conducted 60 independent MD simulations, each with a duration of 100 ns, to study the conformation and dynamics of the aptamer. For each group of 10 independent simulations that originated from a particular predicted structure, we evaluated the potential energy distribution of the RNA and used recurrence quantification analysis to examine the sampling of RNA conformational transitions. To assess the impact of starting from different predicted structures, we computed the density of structure projections onto principal components to compare the regions sampled by the different groups of 10 independent simulations. The recurrence rate and the dependence on initial conformation were also compared among the groups.
We stress the necessity of using different initial configurations as simulation starting points by showing that long simulations started from different initial structures become trapped in different states. Finally, we summarized the sampling efficiency for the complete set of 60 independent simulations and determined regions of under-sampling on the potential energy landscape. The results suggest that conducting multiple independent simulations using a diverse set of predicted structures is a promising approach to achieve sufficient sampling. This approach avoids undesirable outcomes, such as the RNA aptamer being trapped in a local minimum. For others wishing to conduct multiple independent simulations, the analysis protocol presented in this study serves as a guide for examining overall sampling and determining whether more simulations are necessary for sufficient sampling.
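As an illustration of the recurrence rate mentioned above, the following is a minimal sketch and not the authors' protocol. It assumes each trajectory frame has already been reduced to a feature vector (for example, projections onto a few principal components), that two frames count as recurrent when their Euclidean distance is at most a user-chosen threshold `eps`, and that the trivial self-matches on the main diagonal are excluded by default. The function names, the distance metric, and the threshold are all assumptions made for this example.

```python
import numpy as np


def recurrence_matrix(traj, eps):
    """Boolean recurrence matrix for a trajectory.

    traj: array of shape (n_frames, n_features), one feature vector per frame
          (assumed representation; the study may use a different one).
    eps:  distance threshold below which two frames are considered recurrent.
    """
    traj = np.asarray(traj, dtype=float)
    # Pairwise differences via broadcasting: shape (n_frames, n_frames, n_features)
    diff = traj[:, None, :] - traj[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist <= eps


def recurrence_rate(rec, exclude_diagonal=True):
    """Fraction of frame pairs that are recurrent.

    With exclude_diagonal=True, the self-matches (i == j) are left out of
    both the count and the normalization.
    """
    n = rec.shape[0]
    total = np.count_nonzero(rec)
    if exclude_diagonal:
        on_diagonal = np.count_nonzero(np.diag(rec))
        return (total - on_diagonal) / (n * (n - 1))
    return total / (n * n)


if __name__ == "__main__":
    # Toy one-dimensional "trajectory" that sits in one state for three
    # frames and then in a second, distant state for three frames.
    toy = np.array([[0.0], [0.0], [0.0], [5.0], [5.0], [5.0]])
    rec = recurrence_matrix(toy, eps=1.0)
    print(recurrence_rate(rec))                          # 0.4
    print(recurrence_rate(rec, exclude_diagonal=False))  # 0.5
```

In this toy case, frames recur only within their own state, so the rate reflects two disconnected blocks in the recurrence matrix. A trajectory trapped in a single state would give a rate close to 1 at the same threshold.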