Mathews D H, Sabina J, Zuker M, Turner D H
Department of Chemistry, University of Rochester, Rochester, NY, 14627-0216, USA.
J Mol Biol. 1999 May 21;288(5):911-40. doi: 10.1006/jmbi.1999.2700.
An improved dynamic programming algorithm is reported for RNA secondary structure prediction by free energy minimization. Thermodynamic parameters for the stabilities of secondary structure motifs are revised to include expanded sequence dependence as revealed by recent experiments. Additional algorithmic improvements include reduced search time and storage for multibranch loop free energies and improved imposition of folding constraints. An extended database of 151,503 nt in 955 structures determined by comparative sequence analysis was assembled to allow optimization of parameters not based on experiments and to test the accuracy of the algorithm. On average, the predicted lowest free energy structure contains 73 % of known base-pairs when domains of fewer than 700 nt are folded; this compares with 64 % accuracy for previous versions of the algorithm and parameters. For a given sequence, a set of 750 generated structures contains one structure that, on average, has 86 % of known base-pairs. Experimental constraints, derived from enzymatic and flavin mononucleotide cleavage, improve the accuracy of structure predictions.
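The accuracy figures in the abstract (the percentage of known base-pairs found in a predicted structure, and the best such percentage within a set of generated structures) can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of the abstract: structures are written in dot-bracket notation, a predicted pair counts only when it matches a known pair exactly, and the function names `parse_pairs`, `fraction_known_pairs`, and `best_in_set` are hypothetical. The scoring rule actually used in the paper may differ, and pseudoknotted structures cannot be written in this simple notation.

```python
def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs (0-based, i < j) in a dot-bracket string."""
    stack = []
    pairs = set()
    for idx, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {idx}")
            pairs.add((stack.pop(), idx))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {idx}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def fraction_known_pairs(known, predicted):
    """Fraction of known base pairs that also appear in the predicted structure.

    Assumption: exact-match scoring (a pair counts only if both positions agree).
    """
    known_pairs = parse_pairs(known)
    if not known_pairs:
        raise ValueError("known structure contains no base pairs")
    predicted_pairs = parse_pairs(predicted)
    return len(known_pairs & predicted_pairs) / len(known_pairs)


def best_in_set(known, candidates):
    """Highest fraction of known pairs achieved by any structure in a candidate set."""
    return max(fraction_known_pairs(known, c) for c in candidates)


if __name__ == "__main__":
    known = "((((...))))"       # toy "known" structure, 4 base pairs
    predicted = "(((.....)))"   # toy prediction, recovers 3 of them
    print(fraction_known_pairs(known, predicted))   # 0.75
    print(best_in_set(known, [predicted, known]))   # 1.0
```

Averaging `fraction_known_pairs` over a database of known structures would give a figure comparable in kind to the 73 % reported for lowest free energy structures, while averaging `best_in_set` over sets of generated structures corresponds in kind to the 86 % figure.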