Deigan Katherine E, Li Tian W, Mathews David H, Weeks Kevin M
Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599-3290, USA.
Proc Natl Acad Sci U S A. 2009 Jan 6;106(1):97-102. doi: 10.1073/pnas.0806929106. Epub 2008 Dec 24.
Almost all RNAs can fold to form extensive base-paired secondary structures. Many of these structures then modulate numerous fundamental elements of gene expression. Deducing these structure-function relationships requires that it be possible to predict RNA secondary structures accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure for a single sequence reliably represents the correct structure, has remained an unsolved problem. Here, we demonstrate that quantitative, nucleotide-resolution information from a SHAPE experiment can be interpreted as a pseudo-free energy change term and used to determine RNA secondary structure with high accuracy. Free energy minimization, by using SHAPE pseudo-free energies, in conjunction with nearest neighbor parameters, predicts the secondary structure of deproteinized Escherichia coli 16S rRNA (>1,300 nt) and a set of smaller RNAs (75-155 nt) with accuracies of up to 96-100%, which are comparable to the best accuracies achievable by comparative sequence analysis.
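As a rough illustration of how a per-nucleotide SHAPE reactivity might be turned into a pseudo-free energy change term, the sketch below uses a logarithmic form, ΔG_SHAPE(i) = m · ln(reactivity(i) + 1) + b. The abstract does not give the functional form or its parameters, so the form itself, the slope `m = 2.6` kcal/mol, the intercept `b = -0.8` kcal/mol, the use of `None` for missing data, and the clamping of negative reactivities to zero are all assumptions made for this example. The function names are hypothetical and do not belong to any particular folding package.

```python
import math
from typing import Optional, Sequence, List

# Assumed parameters (kcal/mol); not stated in the abstract above.
SLOPE_M = 2.6
INTERCEPT_B = -0.8


def pseudo_free_energy(reactivity: Optional[float],
                       m: float = SLOPE_M,
                       b: float = INTERCEPT_B) -> float:
    """Return an illustrative pseudo-free energy term for one nucleotide.

    Low reactivity (likely paired) gives a negative, stabilizing term;
    high reactivity (likely unpaired) gives a positive, penalizing term.
    """
    if reactivity is None:
        # Assumption: nucleotides without data contribute nothing.
        return 0.0
    # Assumption: slightly negative normalized reactivities are clamped to 0
    # so the logarithm stays defined.
    r = max(reactivity, 0.0)
    return m * math.log(r + 1.0) + b


def pseudo_free_energies(reactivities: Sequence[Optional[float]]) -> List[float]:
    """Apply pseudo_free_energy to every position of a reactivity profile."""
    return [pseudo_free_energy(r) for r in reactivities]


if __name__ == "__main__":
    profile = [0.0, 0.3, None, 1.5]  # made-up reactivities for illustration
    for position, dg in enumerate(pseudo_free_energies(profile), start=1):
        print(f"nt {position}: {dg:+.2f} kcal/mol")
```

In a free energy minimization, a term of this kind would be added to the nearest neighbor energies for nucleotides involved in base pairing, so that positions with low SHAPE reactivity are favored as paired. How the term is applied within a particular folding algorithm is outside the scope of this sketch.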