Tianjin Key Laboratory of Combinatorics, Nankai University, Tianjin 300071, People's Republic of China.
Bioinformatics. 2011 Feb 15;27(4):456-63. doi: 10.1093/bioinformatics/btq659. Epub 2010 Dec 5.
Many computational methods for RNA-RNA interaction structure prediction have been developed. Recently, dynamic programming algorithms with O(N^6) time and O(N^4) space complexity have become available that compute the partition function of RNA-RNA interaction complexes. However, few of these methods incorporate knowledge of related sequences, so relevant evolutionary information is often neglected in structure determination. It is therefore of considerable practical interest to introduce a method that takes into account both thermodynamic stability and sequence/structure covariation.
We present the a priori folding algorithm ripalign, whose input consists of two given multiple sequence alignments (MSAs). ripalign outputs (i) the partition function, (ii) base pairing probabilities, (iii) hybrid probabilities and (iv) a set of Boltzmann-sampled suboptimal structures consisting of canonical joint structures that are compatible with the alignments. Compared to the single sequence-pair folding algorithm rip, ripalign requires negligible additional memory but offers much better sensitivity and specificity, provided that alignments of suitable quality are given. ripalign additionally allows structure constraints to be supplied as input parameters.
The algorithm described here is implemented in C as part of the rip package.
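To make the idea of a base pair being "compatible with the alignments" concrete, the following is a minimal Python sketch and not the ripalign implementation. It assumes that row k of both alignments comes from the same organism and that a column pair counts as compatible when a sufficient fraction of rows forms a canonical pair (A-U, G-C or G-U). The function name `column_pair_compatible` and the `min_fraction` threshold are hypothetical; the abstract does not state the criterion ripalign actually applies.

```python
# Canonical RNA base pairs (Watson-Crick plus G-U wobble), in both orientations.
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def column_pair_compatible(msa_a, i, msa_b, j, min_fraction=1.0):
    """Return True if column i of msa_a can pair with column j of msa_b.

    msa_a and msa_b are lists of equal-length aligned strings; row k of
    both alignments is assumed (hypothetically) to belong to the same
    organism. min_fraction is an illustrative threshold on the fraction
    of rows that must form a canonical pair.
    """
    if not msa_a or len(msa_a) != len(msa_b):
        raise ValueError("alignments must be non-empty and have equal row counts")
    # Count the rows in which the two aligned characters form a canonical pair.
    canonical_rows = sum(
        1
        for row_a, row_b in zip(msa_a, msa_b)
        if (row_a[i].upper(), row_b[j].upper()) in CANONICAL
    )
    return canonical_rows / len(msa_a) >= min_fraction
```

For instance, with `msa_a = ["GCAU", "GUAU"]` and `msa_b = ["AUGC", "AUGC"]`, columns 1 and 2 are compatible because the rows form C-G and U-G, a compensatory change of the kind that sequence/structure covariation is meant to capture.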