Biomedical Sciences Program, University at Albany, Albany, New York 12208, USA.
RNA. 2010 Jun;16(6):1108-17. doi: 10.1261/rna.1988510. Epub 2010 Apr 22.
Structure mapping experiments (using probes such as dimethyl sulfate [DMS], kethoxal, and T1 and V1 RNases) are used to determine the secondary structures of RNA molecules. The process is iterative, combining the results of several probes with constrained minimum free-energy calculations to produce a model of the structure. We aim to evaluate whether particular probes provide more structural information, and specifically, how noise in the data affects the predictions. Our approach involves generating "decoy" RNA structures (using the sFold Boltzmann sampling procedure) and evaluating whether we are able to identify the correct structure from this ensemble of structures. We show that with perfect information, we are always able to identify the optimal structure for five RNAs of known structure. We then collected orthogonal structure mapping data (DMS and RNase T1 digest) under several solution conditions using our high-throughput capillary automated footprinting analysis (CAFA) technique on two group I introns of known structure. Analysis of these data reveals the error rates in the data under optimal (low salt) and suboptimal solution conditions (high MgCl2). We show that despite these errors, our computational approach is less sensitive to experimental noise than traditional constraint-based structure prediction algorithms. Finally, we propose a novel approach for visualizing the interaction of chemical and enzymatic mapping data with RNA structure. We project the data onto the first two dimensions of a multidimensional scaling of the sFold-generated decoy structures. We are able to directly visualize the structural information content of structure mapping data and reconcile multiple data sets.
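The decoy-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' scoring scheme: the function names (`paired_positions`, `agreement`, `best_decoy`), the toy dot-bracket decoys, and the simple "reactive means unpaired" agreement score are all assumptions made for illustration. In a real analysis the decoys would come from a Boltzmann sampler such as sFold, and only the nucleotides a probe can report on (for example, A and C for DMS, G for RNase T1) would count as probed.

```python
def paired_positions(dot_bracket):
    """Return the set of 0-based indices that are base-paired in a dot-bracket string."""
    stack, paired = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            paired.update((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return paired


def agreement(dot_bracket, reactive, probed):
    """Fraction of probed positions whose pairing state matches the mapping data.

    Assumption (hypothetical scoring rule): a reactive position should be
    unpaired, and a probed but unreactive position should be paired.
    """
    paired = paired_positions(dot_bracket)
    hits = sum((i in reactive) != (i in paired) for i in probed)
    return hits / len(probed)


def best_decoy(decoys, reactive, probed):
    """Pick the decoy structure that agrees best with the mapping data."""
    return max(decoys, key=lambda s: agreement(s, reactive, probed))


# Toy ensemble of three decoys for a 12-nucleotide RNA (illustrative only).
decoys = ["((((....))))", "((......))..", "............"]
probed = list(range(12))        # pretend every position is informative

perfect = {4, 5, 6, 7}          # noise-free data for the hairpin loop
noisy = {0, 4, 5, 6}            # one false positive (0), one false negative (7)

print(best_decoy(decoys, perfect, probed))
print(best_decoy(decoys, noisy, probed))
```

In this toy case the hairpin decoy is still ranked first with the noisy data, because two erroneous positions lower its score without raising a competing decoy above it. This mirrors, in a very simplified way, why ranking a sampled ensemble can tolerate errors that would mislead a hard constraint.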