Das Rhiju, Baker David
Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, Box 357350, Seattle, WA 98195, USA.
Proc Natl Acad Sci U S A. 2007 Sep 11;104(37):14664-9. doi: 10.1073/pnas.0703836104. Epub 2007 Aug 28.
RNA tertiary structure prediction has been based almost entirely on base-pairing constraints derived from phylogenetic covariation analysis. We describe here a complementary approach, inspired by the Rosetta low-resolution protein structure prediction method, that seeks the lowest energy tertiary structure for a given RNA sequence without using evolutionary information. In a benchmark test of 20 RNA sequences with known structure and lengths of approximately 30 nt, the new method reproduces better than 90% of Watson-Crick base pairs, comparable with the accuracy of secondary structure prediction methods. In more than half the cases, at least one of the top five models agrees with the native structure to better than 4 Å rmsd over the backbone. Most importantly, the method recapitulates more than one-third of non-Watson-Crick base pairs seen in the native structures. Tandem stacks of "sheared" base pairs, base triplets, and pseudoknots are among the noncanonical features reproduced in the models. In the cases in which none of the top five models were native-like, higher energy conformations similar to the native structures are still sampled frequently but not assigned low energies. These results suggest that modest improvements in the energy function, together with the incorporation of information from phylogenetic covariance, may allow confident and accurate structure prediction for larger and more complex RNA chains.
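The two benchmark criteria quoted in the abstract (fraction of native base pairs reproduced, and whether any of the five lowest-energy models falls within 4 Å backbone rmsd of the native structure) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the representation of base pairs as residue-index tuples, and the assumption that rmsd values are already computed after superposition and sorted by energy rank are all hypothetical choices made for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple

BasePair = Tuple[int, int]  # assumed representation: a pair of residue indices


def normalize(pairs: Iterable[BasePair]) -> Set[BasePair]:
    """Store each pair as (low, high) so that (i, j) and (j, i) compare equal."""
    return {(min(i, j), max(i, j)) for i, j in pairs}


def base_pair_recovery(native: Iterable[BasePair], model: Iterable[BasePair]) -> float:
    """Fraction of native base pairs that also appear in the model."""
    native_set = normalize(native)
    if not native_set:
        raise ValueError("native structure has no base pairs")
    return len(native_set & normalize(model)) / len(native_set)


def any_native_like(rmsds_by_energy_rank: Sequence[float],
                    top_n: int = 5, cutoff: float = 4.0) -> bool:
    """True if any of the top_n lowest-energy models is within cutoff (in Å).

    Assumes the rmsd values are precomputed and ordered from lowest energy up.
    """
    return any(r < cutoff for r in rmsds_by_energy_rank[:top_n])


# Hypothetical hairpin: four native pairs, three of them present in the model.
native_pairs = [(1, 12), (2, 11), (3, 10), (4, 9)]
model_pairs = [(12, 1), (2, 11), (3, 10), (5, 8)]
recovery = base_pair_recovery(native_pairs, model_pairs)

# Hypothetical rmsd values; the sixth model is outside the top five.
hit = any_native_like([6.2, 5.1, 3.7, 8.0, 4.4, 2.0])
```

With these made-up inputs, `recovery` is 0.75 and `hit` is true because the third-ranked model is at 3.7 Å. The same recovery function could be applied separately to Watson-Crick and non-Watson-Crick pair lists, which is how the abstract reports them.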