Department of Biochemistry, Stanford University, Stanford, California 94305, USA.
Biochemistry. 2011 Sep 20;50(37):8049-56. doi: 10.1021/bi200524n. Epub 2011 Aug 25.
Single-nucleotide-resolution chemical mapping for structured RNA is being rapidly advanced by new chemistries, faster readouts, and coupling to computational algorithms. Recent tests have shown that selective 2'-hydroxyl acylation by primer extension (SHAPE) can give near-zero error rates (0-2%) in modeling the helices of RNA secondary structure. Here, we benchmark the method using six molecules for which crystallographic data are available: tRNA(Phe) and 5S rRNA from Escherichia coli, the P4-P6 domain of the Tetrahymena group I ribozyme, and ligand-bound domains from riboswitches for adenine, cyclic di-GMP, and glycine. SHAPE-directed modeling of these highly structured RNAs gave an overall false negative rate (FNR) of 17% and a false discovery rate (FDR) of 21%, with at least one helix prediction error in five of the six cases. Extensive variations of data processing, normalization, and modeling parameters did not significantly mitigate modeling errors. Only one variation, filtering out data collected with deoxyinosine triphosphate during primer extension, gave a modest improvement (FNR = 12% and FDR = 14%). The residual structure modeling errors are explained by the insufficient information content of these RNAs' SHAPE data, as evaluated by a nonparametric bootstrapping analysis. Beyond these benchmark cases, bootstrapping suggests a low level of confidence (<50%) in the majority of helices in a previously proposed SHAPE-directed model for the HIV-1 RNA genome. Thus, SHAPE-directed RNA modeling is not always unambiguous, and helix-by-helix confidence estimates, as described herein, may be critical for interpreting results from this powerful methodology.
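The two helix-level error metrics and the bootstrap confidence estimate named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that helices are compared by exact identity (real benchmarks typically allow some tolerance when matching helices), that the bootstrap resamples nucleotide positions with replacement, and that `model_fn` is a hypothetical stand-in for whatever SHAPE-directed modeling program produces a set of helices from reactivity data. All function and parameter names are illustrative.

```python
import random
from collections import Counter


def helix_error_rates(reference, predicted):
    """Return (FNR, FDR) for a predicted helix set against a reference set.

    FNR = fraction of reference helices missing from the prediction.
    FDR = fraction of predicted helices absent from the reference.
    Helices are matched by exact identity (a simplifying assumption).
    """
    reference, predicted = set(reference), set(predicted)
    recovered = reference & predicted
    fnr = 1 - len(recovered) / len(reference) if reference else 0.0
    fdr = 1 - len(recovered) / len(predicted) if predicted else 0.0
    return fnr, fdr


def bootstrap_helix_support(reactivities, model_fn, n_replicates=100, seed=0):
    """Estimate per-helix confidence by nonparametric bootstrapping.

    reactivities: dict mapping nucleotide position -> SHAPE reactivity.
    model_fn: hypothetical callable taking such a dict and returning an
        iterable of hashable helix identifiers.
    Returns a dict mapping each helix to the fraction of replicates in
    which it appeared.
    """
    rng = random.Random(seed)
    positions = list(reactivities)
    counts = Counter()
    for _ in range(n_replicates):
        # Draw positions with replacement; positions that are never
        # drawn contribute no data to this replicate.
        sample = rng.choices(positions, k=len(positions))
        resampled = {pos: reactivities[pos] for pos in sample}
        counts.update(set(model_fn(resampled)))
    return {helix: n / n_replicates for helix, n in counts.items()}
```

Under these assumptions, a helix whose support falls below 0.5 corresponds to the "low level of confidence (<50%)" case described in the abstract: the helix appears in fewer than half of the models built from resampled data.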