Hurst Travis, Zhang Dong, Zhou Yuanzhe, Chen Shi-Jie
Department of Physics, University of Missouri-Columbia, Columbia, MO 65211, USA.
Commun Inf Syst. 2021;21(1):65-83. doi: 10.4310/cis.2021.v21.n1.a4.
Coarse-grained (CG) RNA folding models are appealing for rapid characterization of RNA molecules because of their potential utility in predicting conformational changes and assessing folding dynamics. Previously, we reported the iterative simulated RNA reference state (IsRNA) method for parameterizing a CG force field for RNA folding. The method consecutively updates the simulation force field to reflect the marginal distributions of folding coordinates in the structure database and extracts the various energy terms. The IsRNA model was validated by showing close agreement between the IsRNA-simulated and experimentally observed distributions. Here, we expand our theoretical understanding of the model and, in doing so, improve the parameterization process by optimizing the subset of included folding coordinates, which leads to accelerated simulations. Using statistical mechanical theory, we analyze the underlying Bayesian concept that drives parameterization of the energy function, providing a general method for developing predictive, knowledge-based polymer force fields on the basis of limited data. Furthermore, we propose an optimal parameterization procedure based on the principle of maximum entropy.
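The idea of repeatedly updating energy terms until simulated marginal distributions match those of a database can be illustrated with a toy sketch. This is not the IsRNA implementation: the two discrete coordinates, the fixed `coupling` table, the target marginals, and the function names are all hypothetical, and exact enumeration of the Boltzmann distribution stands in for a molecular simulation. The update used here is the classical iterative-proportional-fitting step written in energy form, E(x) ← E(x) + kT ln[P_sim(x) / P_target(x)], applied to one coordinate at a time.

```python
import math

KT = 1.0  # thermal energy in arbitrary units (assumption)

def boltzmann_marginals(ex, ey, coupling):
    """Exact marginals of p(x, y) proportional to exp(-E(x, y) / KT),
    with E(x, y) = ex[x] + ey[y] + coupling[x][y]."""
    w = [[math.exp(-(ex[i] + ey[j] + coupling[i][j]) / KT)
          for j in range(len(ey))] for i in range(len(ex))]
    z = sum(sum(row) for row in w)
    px = [sum(row) / z for row in w]
    py = [sum(w[i][j] for i in range(len(ex))) / z for j in range(len(ey))]
    return px, py

def fit_marginal_terms(coupling, target_x, target_y, n_iter=500, tol=1e-10):
    """Adjust the 1D energy terms until both marginals match the targets."""
    ex = [0.0] * len(target_x)
    ey = [0.0] * len(target_y)
    for _ in range(n_iter):
        # Raise the energy wherever x is over-populated relative to the target.
        px, _ = boltzmann_marginals(ex, ey, coupling)
        ex = [e + KT * math.log(p / t) for e, p, t in zip(ex, px, target_x)]
        # Same correction for y, using marginals recomputed after the x update.
        _, py = boltzmann_marginals(ex, ey, coupling)
        ey = [e + KT * math.log(p / t) for e, p, t in zip(ey, py, target_y)]
        px, py = boltzmann_marginals(ex, ey, coupling)
        err = max(abs(p - t) for p, t in zip(px + py, target_x + target_y))
        if err < tol:
            break
    return ex, ey

# Hypothetical inputs: a fixed coupling energy and "database" marginals.
coupling = [[0.0, 0.5, -0.3],
            [0.8, -0.2, 0.1]]
target_x = [0.3, 0.7]
target_y = [0.2, 0.3, 0.5]

ex, ey = fit_marginal_terms(coupling, target_x, target_y)
px, py = boltzmann_marginals(ex, ey, coupling)
print("x marginal:", [round(p, 6) for p in px])
print("y marginal:", [round(p, 6) for p in py])
```

Because the coupling term ties the two coordinates together, a single Boltzmann inversion of each marginal is not enough, which is why the correction has to be iterated. In a real force-field setting the marginals would come from sampled trajectories rather than exact enumeration, so the updates would be noisy and typically damped.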