Department of Mathematics, Center for Combinatorics, Key Laboratory of Pure Mathematics and Combinatorics, College of Life Science, Nankai University, Tianjin 300071, PR China.
Bioinformatics. 2011 Apr 15;27(8):1076-85. doi: 10.1093/bioinformatics/btr090. Epub 2011 Feb 17.
Several dynamic programming algorithms for predicting RNA structures with pseudoknots have been proposed that differ dramatically from one another in the classes of structures considered.
Here, we use the natural topological classification of RNA structures in terms of irreducible components that are embeddable in surfaces of fixed genus. To the conventional secondary structures we add four building blocks of genus one, from which certain structures of arbitrarily high genus can be constructed. A corresponding unambiguous multiple context-free grammar provides an efficient dynamic programming approach for energy minimization, partition function computation and stochastic sampling. It admits a topology-dependent parametrization of pseudoknot penalties that increases the sensitivity and positive predictive value of predicted base pairs by 10-20% compared with earlier approaches. More general models based on building blocks of higher genus are also discussed.
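To make the notion of genus concrete, the following is a minimal sketch of how the topological genus of a structure can be computed from its base pairs. It is not taken from gfold; the function name `genus`, the 0-based indexing and the pair-list input are assumptions made for illustration. The sketch uses the standard fatgraph argument: the backbone is closed into a disk, each base pair is attached as a band, the boundary components r are counted by tracing the boundary, and the genus follows from the Euler characteristic as g = (1 + P - r) / 2, where P is the number of base pairs.

```python
def genus(n, pairs):
    """Topological genus of an RNA diagram (illustrative sketch, not gfold code).

    n     -- number of backbone positions, indexed 0 .. n-1
    pairs -- iterable of (i, j) base pairs; each position pairs at most once
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    partner = {}
    for i, j in pairs:
        if not (0 <= i < n and 0 <= j < n) or i == j:
            raise ValueError(f"invalid pair ({i}, {j})")
        if i in partner or j in partner:
            raise ValueError(f"position paired twice in ({i}, {j})")
        partner[i] = j
        partner[j] = i

    # Boundary walk: on reaching position k, cross its arc (if any) and
    # continue to the next backbone position, wrapping around at the end.
    def step(k):
        return (partner.get(k, k) + 1) % n

    # Count the cycles of this permutation = number of boundary components.
    seen = [False] * n
    boundaries = 0
    for start in range(n):
        if seen[start]:
            continue
        boundaries += 1
        k = start
        while not seen[k]:
            seen[k] = True
            k = step(k)

    num_pairs = len(partner) // 2
    # Euler characteristic of a disk with P bands: 1 - P = 2 - 2g - r.
    twice_genus = 1 + num_pairs - boundaries
    assert twice_genus % 2 == 0
    return twice_genus // 2


nested = genus(4, [(0, 3), (1, 2)])                    # secondary structure
h_type = genus(4, [(0, 2), (1, 3)])                    # H-type pseudoknot
kissing = genus(6, [(0, 2), (1, 4), (3, 5)])           # kissing hairpins
two_h = genus(8, [(0, 2), (1, 3), (4, 6), (5, 7)])     # two H-types in a row
print(nested, h_type, kissing, two_h)
```

A nested (pseudoknot-free) structure has genus zero, the H-type and kissing-hairpin patterns have genus one, and concatenating two H-type patterns gives genus two, which illustrates how genus-one components can combine into structures of higher genus.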
The source code of gfold is freely available at http://www.combinatorics.cn/cbpc/gfold.tar.gz.
Supplementary data are available at Bioinformatics online.