Department of Biology, LMU Biocenter, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany.
Bioinformatics. 2012 Sep 1;28(17):2242-8. doi: 10.1093/bioinformatics/bts369. Epub 2012 Jul 13.
Many non-coding RNAs are now known to play an active role in important biological processes. Because an RNA's function is correlated with specific structural motifs that are often conserved in phylogenetically related molecules, computational prediction of RNA structure should ideally be based on a set of homologous primary structures. However, many available RNA secondary structure prediction programs that use sequence alignments either do not consider pseudoknots or return a single structure estimate without information on its uncertainty.
In this article we present a method that takes advantage of the evolutionary history of a group of aligned RNA sequences to sample consensus secondary structures, including pseudoknots, according to their approximate posterior probability. We investigate the benefit of using evolutionary history and, using RNase P RNA sequences and simulated data, demonstrate that our method is competitive with similar methods.
PhyloQFold, a C++ implementation of our method, is freely available from http://evol.bio.lmu.de/_statgen/software/phyloqfold/.