Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina 27599-3290, USA.
RNA. 2010 Jul;16(7):1340-9. doi: 10.1261/rna.1837410. Epub 2010 May 24.
Tertiary structure prediction is important for understanding structure-function relationships for RNAs whose structures are unknown and for characterizing RNA states recalcitrant to direct analysis. However, it is unknown what root-mean-square deviation (RMSD) corresponds to a statistically significant RNA tertiary structure prediction. We use discrete molecular dynamics to generate RNA-like folds for structures up to 161 nucleotides (nt) that have complex tertiary interactions and then determine the RMSD distribution between these decoys. These distributions are Gaussian-like. The mean RMSD increases with RNA length and is smaller if secondary structure constraints are imposed while generating decoys. The compactness of RNA molecules with true tertiary folds is intermediate between closely packed spheres and a freely jointed chain. We use this scaling relationship to define an expression relating RMSD with the confidence that a structure prediction is better than that expected by chance. This is the prediction significance, and corresponds to a P-value. For a 100-nt RNA, the RMSD of predicted structures should be within 25 Å of the accepted structure to reach the P ≤ 0.01 level if the secondary structure is predicted de novo and within 14 Å if secondary structure information is used as a constraint. This significance approach should be useful for evaluating diverse RNA structure prediction and molecular modeling algorithms.
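The idea of reading a prediction's RMSD against a Gaussian-like decoy distribution can be illustrated with a minimal Python sketch. It assumes the decoy RMSDs are well described by a normal distribution with a given mean and standard deviation; the function names and all numerical values below are hypothetical placeholders, not the fitted length-dependent expression or parameters from the paper.

```python
import math
import statistics


def summarize_decoys(decoy_rmsds):
    """Return (mean, sample standard deviation) of a list of decoy RMSDs in Å."""
    return statistics.mean(decoy_rmsds), statistics.stdev(decoy_rmsds)


def prediction_p_value(rmsd, decoy_mean, decoy_sd):
    """Probability that a chance (decoy-like) structure has an RMSD <= rmsd.

    Assumes a Gaussian decoy distribution, so this is the lower-tail
    normal CDF evaluated at the observed RMSD.
    """
    if decoy_sd <= 0:
        raise ValueError("decoy_sd must be positive")
    z = (rmsd - decoy_mean) / decoy_sd
    # Phi(z) = 0.5 * erfc(-z / sqrt(2))
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy decoy RMSDs (Å), invented purely for illustration.
    toy_decoys = [27.5, 31.0, 29.2, 33.8, 30.1, 28.4, 32.6, 30.9]
    mean, sd = summarize_decoys(toy_decoys)

    observed_rmsd = 22.0  # hypothetical RMSD of a prediction vs. the accepted structure
    p = prediction_p_value(observed_rmsd, mean, sd)
    print(f"decoy mean = {mean:.2f} Å, sd = {sd:.2f} Å, P = {p:.3g}")
```

A smaller P means the prediction is closer to the accepted structure than chance folds typically are. In practice the mean and standard deviation would come from the decoy ensemble for an RNA of the relevant length, with or without secondary structure constraints, rather than from a toy list like the one above.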