Ding Ye, Chan Chi Yu, Lawrence Charles E
Bioinformatics Center, Wadsworth Center, New York State Department of Health, Albany, NY 12208, USA.
RNA. 2005 Aug;11(8):1157-66. doi: 10.1261/rna.2500605.
Prediction of RNA secondary structure by free energy minimization has been the standard for over two decades. Here we describe a novel method that forsakes this paradigm for predictions based on a Boltzmann-weighted structure ensemble. We introduce the notion of a centroid structure as a representative for a set of structures and describe a procedure for its identification. In comparison with the minimum free energy (MFE) structure using diverse types of structural RNAs, the centroid of the ensemble makes 30.0% fewer prediction errors as measured by the positive predictive value (PPV) with marginally improved sensitivity. The Boltzmann ensemble can be separated into a small number (3.2 on average) of clusters. Among the centroids of these clusters, the "best cluster centroid" as determined by comparison to the known structure simultaneously improves PPV by 46.5% and sensitivity by 21.7%. For 58% of the studied sequences for which the MFE structure is outside the cluster containing the best centroid, the improvements by the best centroid are 62.5% for PPV and 31.4% for sensitivity. These results suggest that the energy well containing the MFE structure under the current incomplete energy model is often different from the one for the unavailable complete model that presumably contains the unique native structure. Centroids are available on the Sfold server at http://sfold.wadsworth.org.
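The notion of a centroid and the two accuracy measures named in the abstract can be illustrated with a minimal sketch. It assumes that structures are given in dot-bracket notation, that the distance between two structures is the number of base pairs present in one but not the other, and that the centroid is then the set of base pairs occurring in more than half of the structures, which minimizes the total distance to the set. The function names and the five toy structures are hypothetical; they are not taken from Sfold and were not produced by Boltzmann sampling.

```python
from collections import Counter

def parse_pairs(dot_bracket):
    """Return the set of base pairs (i, j), i < j, in a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            pairs.add((stack.pop(), j))
    return pairs

def centroid(structures):
    """Base pairs present in more than half of the given structures.

    Assumption: base-pair distance is used, so the majority-rule set
    minimizes the summed distance to all structures in the set.
    """
    counts = Counter(pair for s in structures for pair in s)
    return {pair for pair, c in counts.items() if c > len(structures) / 2}

def total_distance(candidate, structures):
    """Sum of base-pair distances (symmetric differences) to every structure."""
    return sum(len(candidate ^ s) for s in structures)

def ppv(predicted, known):
    """Fraction of predicted base pairs that are in the known structure."""
    return len(predicted & known) / len(predicted) if predicted else 0.0

def sensitivity(predicted, known):
    """Fraction of known base pairs that are predicted."""
    return len(predicted & known) / len(known) if known else 0.0

# Toy stand-in for a sampled ensemble (hypothetical, hand-written).
sample = [parse_pairs(s) for s in (
    "((...))",
    "((...))",
    "(.....)",
    ".(...).",
    ".......",
)]
known = parse_pairs("(.....)")  # hypothetical reference structure

center = centroid(sample)
print(sorted(center))
print(ppv(center, known), sensitivity(center, known))
```

In this toy set each of the two base pairs occurs in three of the five structures, so both enter the centroid. Applying the same `centroid` function to each cluster of a partitioned sample would give per-cluster centroids in the sense used in the abstract; the clustering step itself is not shown here.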