Siebert Sven, Backofen Rolf
Department of Bioinformatics, Institute of Computer Science, Friedrich-Schiller-University Jena, Ernst-Abbe Platz 2, 07743 Jena, Germany.
Bioinformatics. 2005 Aug 15;21(16):3352-9. doi: 10.1093/bioinformatics/bti550. Epub 2005 Jun 21.
Because secondary structure is important when aligning functional RNAs, several pairwise sequence-structure alignment methods have been developed. They use extended alignment scores that evaluate secondary structure information in addition to sequence information. However, two problems remain for the multiple alignment step: first, how to combine pairwise sequence-structure alignments into a multiple alignment, and second, how to generate secondary structure information for sequences whose explicit structural information is missing.
We describe a novel approach for multiple alignment of RNAs (MARNA) that takes into consideration both the primary and the secondary structures. It is based on pairwise sequence-structure comparisons of RNAs. From these sequence-structure alignments, libraries of weighted alignment edges are generated. The weights reflect the sequential and structural conservation. For sequences whose secondary structures are missing, the libraries are generated by sampling low-energy conformations. The libraries are then processed by the T-Coffee system, which is a consistency-based multiple alignment method. Furthermore, we are able to extract a consensus sequence and a consensus structure from a multiple alignment. We have successfully tested MARNA on several datasets taken from the Rfam database.
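As an illustration of the library step, the following minimal Python sketch turns pairwise alignments into weighted alignment edges. It is not MARNA's actual scoring: the function names, the dot-bracket structure input, and the weighting (1 per aligned column, plus 1 for identical bases, plus 1 when both positions are base-paired) are assumptions chosen only to show how a weight can reflect both sequential and structural conservation.

```python
from collections import defaultdict


def aligned_pairs(row_a, row_b):
    """Yield 0-based residue index pairs (i, j) for columns without gaps."""
    i = j = 0
    for char_a, char_b in zip(row_a, row_b):
        if char_a != "-" and char_b != "-":
            yield i, j
        if char_a != "-":
            i += 1
        if char_b != "-":
            j += 1


def edge_weight(base_a, base_b, paired_a, paired_b):
    """Hypothetical weight: not the scoring used by MARNA."""
    weight = 1                      # the two positions are aligned
    if base_a == base_b:
        weight += 1                 # sequence conservation
    if paired_a and paired_b:
        weight += 1                 # structural conservation
    return weight


def build_library(alignments, structures):
    """Collect weighted edges from pairwise alignments.

    alignments: {(name_a, name_b): (gapped_row_a, gapped_row_b)}
    structures: {name: dot-bracket string over the ungapped sequence}
    Returns {(name_a, i, name_b, j): weight}.
    """
    library = defaultdict(int)
    for (name_a, name_b), (row_a, row_b) in alignments.items():
        seq_a = row_a.replace("-", "")
        seq_b = row_b.replace("-", "")
        for i, j in aligned_pairs(row_a, row_b):
            paired_a = structures[name_a][i] != "."
            paired_b = structures[name_b][j] != "."
            library[(name_a, i, name_b, j)] += edge_weight(
                seq_a[i], seq_b[j], paired_a, paired_b
            )
    return dict(library)


# Toy input: two short hairpins and one pairwise alignment between them.
structures = {"A": "((..))", "B": "((.))"}
alignments = {("A", "B"): ("GCAUGC", "GC-UGC")}
library = build_library(alignments, structures)
```

A library of this shape (residue pairs with weights) is the kind of input a consistency-based method such as T-Coffee combines into a multiple alignment. For sequences without a known structure, the same function could be applied once per sampled low-energy conformation, with the weights accumulated across samples.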