Scuola Internazionale Superiore di Studi Avanzati, via Bonomea 265, Trieste 34136, Italy.
Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, Brno 612 65, Czech Republic.
J Chem Theory Comput. 2022 Jul 12;18(7):4490-4502. doi: 10.1021/acs.jctc.2c00200. Epub 2022 Jun 14.
The capability of current force fields to reproduce RNA structural dynamics is limited. Several methods have been developed to take advantage of experimental data in order to enforce agreement with experiments. Here, we extend an existing framework which allows arbitrarily chosen force-field correction terms to be fitted by quantification of the discrepancy between observables back-calculated from simulation and corresponding experiments. We apply a robust regularization protocol to avoid overfitting and additionally introduce and compare a number of different regularization strategies, namely, L1, L2, Kish size, relative Kish size, and relative entropy penalties. The training set includes a GACC tetramer as well as more challenging systems, namely, gcGAGAgc and gcUUCGgc RNA tetraloops. Specific intramolecular hydrogen bonds in the AMBER RNA force field are corrected with automatically determined parameters that we call gHBfix. A validation involving a separate simulation of a system present in the training set (gcUUCGgc) and new systems not seen during training (CAAU and UUUU tetramers) displays improvements regarding the native population of the tetraloop as well as good agreement with NMR experiments for tetramers when using the new parameters. Then, we simulate folded RNAs (a kink-turn and L1 stalk rRNA) including hydrogen bond types not sufficiently present in the training set. This allows a final modification of the parameter set which is named gHBfix21 and is suggested to be applicable to a wider range of RNA systems.
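The fitting idea summarized above (reweight existing trajectory frames under a trial correction, compare back-calculated observables with experiment, and penalize departures from the original ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hypothetical `features` array holding per-frame counts of each corrected hydrogen-bond type, a correction energy that is linear in those counts, uniform initial frame weights, and `kT` in the same energy units as the coefficients `lam`. The Kish size and relative entropy use their common textbook definitions, which may differ in detail from those in the paper.

```python
import numpy as np

def reweight(features, lam, kT=1.0):
    """Normalized frame weights after adding the correction energy features @ lam.

    features: (n_frames, n_terms) array (assumed: per-frame interaction counts)
    lam:      (n_terms,) trial correction coefficients
    """
    log_w = -(features @ lam) / kT
    log_w -= log_w.max()            # shift for numerical stability before exp
    w = np.exp(log_w)
    return w / w.sum()

def kish_size(w):
    """Kish effective sample size of normalized weights (equals N when uniform)."""
    return 1.0 / np.sum(w ** 2)

def relative_entropy(w):
    """Relative entropy of the weights with respect to uniform weights 1/N."""
    nz = w > 0                      # skip zero weights, whose contribution is 0
    return np.sum(w[nz] * np.log(w[nz] * w.size))

def chi2(w, calc, exp, sigma):
    """Mean squared deviation of reweighted averages from experiment, in units of sigma."""
    avg = w @ calc                  # calc: (n_frames, n_obs) back-calculated observables
    return np.mean(((avg - exp) / sigma) ** 2)

def cost(lam, features, calc, exp, sigma, alpha=1.0, penalty="l2", kT=1.0):
    """chi2 plus a regularization term scaled by the hyperparameter alpha."""
    w = reweight(features, lam, kT)
    if penalty == "l1":
        reg = np.sum(np.abs(lam))
    elif penalty == "l2":
        reg = np.sum(lam ** 2)
    elif penalty == "relative_entropy":
        reg = relative_entropy(w)
    else:
        raise ValueError(f"unknown penalty: {penalty}")
    return chi2(w, calc, exp, sigma) + alpha * reg
```

With `lam` set to zero the weights stay uniform, the Kish size equals the number of frames, and the relative entropy is zero. Larger corrections concentrate weight on fewer frames, which is what the ensemble-based penalties discourage. In practice `cost` would be passed to a numerical minimizer and `alpha` chosen by cross-validation.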