Andersen Ebbe S, Lind-Thomsen Allan, Knudsen Bjarne, Kristensen Susie E, Havgaard Jakob H, Torarinsson Elfar, Larsen Niels, Zwieb Christian, Sestoft Peter, Kjems Jørgen, Gorodkin Jan
Department of Molecular Biology, University of Aarhus, Århus C, Denmark.
RNA. 2007 Nov;13(11):1850-9. doi: 10.1261/rna.215407. Epub 2007 Sep 5.
We have developed a semiautomated RNA sequence editor (SARSE) that integrates tools for analyzing RNA alignments. The editor highlights different properties of the alignment by color, and its integrated analysis tools prevent the introduction of errors during alignment editing. SARSE readily connects to external tools to provide a flexible semiautomatic editing environment. A new method, Pcluster, is introduced for dividing the sequences of an RNA alignment into subgroups with secondary structure differences. Pcluster was used to evaluate 574 seed alignments obtained from the Rfam database, and we identified 71 alignments with significant prediction of inconsistent base pairs and 102 alignments with significant prediction of novel base pairs. Four RNA families were used to illustrate how SARSE can be used to manually or automatically correct the inconsistent base pairs detected by Pcluster: the mir-399 RNA, vertebrate telomerase RNA (vert-TR), bacterial transfer-messenger RNA (tmRNA), and the signal recognition particle (SRP) RNA. The general applicability of the method is illustrated by its ability to accommodate pseudoknots and to handle even large and divergent RNA families. The open architecture of the SARSE editor makes it a flexible tool to improve all RNA alignments with relatively little human intervention. Online documentation and software are available at http://sarse.ku.dk.
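To make the notion of an "inconsistent base pair" concrete, the following is a minimal Python sketch of one simple way to flag such pairs. It is not Pcluster and not part of SARSE; the abstract does not describe how either is implemented. The sketch assumes a gapped alignment held as a dictionary of equal-length strings, a consensus structure in dot-bracket notation with a second bracket type for pseudoknots, and a working definition of "inconsistent" as a paired column whose two residues are not A-U, G-C, or G-U. The function names `paired_columns` and `inconsistent_pairs` are hypothetical.

```python
# Illustrative sketch only: NOT the Pcluster method, whose internals the
# abstract does not describe. All names here are hypothetical.

# Assumption: a pair is "consistent" if it is Watson-Crick or a G-U wobble.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

# Assumption: pseudoknots are written with additional bracket types.
BRACKETS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def paired_columns(structure):
    """Return sorted (i, j) column pairs from a dot-bracket string.

    Each bracket type gets its own stack, so crossing (pseudoknotted)
    helices can be expressed with a different bracket type.
    """
    closers = {close: open_ for open_, close in BRACKETS.items()}
    stacks = {open_: [] for open_ in BRACKETS}
    pairs = []
    for j, ch in enumerate(structure):
        if ch in BRACKETS:
            stacks[ch].append(j)
        elif ch in closers:
            stack = stacks[closers[ch]]
            if not stack:
                raise ValueError(f"unmatched {ch!r} at column {j}")
            pairs.append((stack.pop(), j))
    if any(stacks.values()):
        raise ValueError("structure contains an unclosed bracket")
    return sorted(pairs)


def inconsistent_pairs(alignment, structure):
    """Map each sequence name to the paired columns it cannot satisfy.

    Assumption: a pair where both columns are gaps counts as a deleted
    helix position and is skipped; a pair with a single gap is flagged.
    """
    pairs = paired_columns(structure)
    report = {}
    for name, seq in alignment.items():
        if len(seq) != len(structure):
            raise ValueError(f"{name}: length differs from structure")
        seq = seq.upper().replace("T", "U")
        report[name] = [
            (i, j)
            for i, j in pairs
            if not (seq[i] == "-" and seq[j] == "-")
            and (seq[i], seq[j]) not in CANONICAL
        ]
    return report


# Toy data, invented for illustration (not taken from Rfam).
toy_structure = "((..[[..))..]]"
toy_alignment = {
    "seq1": "GCAAGGAAGCAACC",
    "seq2": "GAAAGGAAGCAACC",
    "seq3": "GCAA--AAGCAA--",
}
for seq_name, bad in inconsistent_pairs(toy_alignment, toy_structure).items():
    print(seq_name, bad)
```

Sequences that share the same set of flagged columns could then be grouped together for inspection, which loosely mirrors the idea of separating an alignment into subgroups with secondary structure differences; the statistical significance testing mentioned in the abstract is not modeled here.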