STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time.

Author information

Dalli Deniz, Wilm Andreas, Mainz Indra, Steger Gerhard

Affiliation

Heinrich-Heine-Universität Düsseldorf, Institut für Physikalische Biologie D-40225 Düsseldorf, Germany.

Publication information

Bioinformatics. 2006 Jul 1;22(13):1593-9. doi: 10.1093/bioinformatics/btl142. Epub 2006 Apr 13.

Abstract

MOTIVATION

Alignment of RNA has a wide range of applications, for example in phylogeny inference, consensus structure prediction and homology searches. Yet aligning structural or non-coding RNAs (ncRNAs) correctly is notoriously difficult as these RNA sequences may evolve by compensatory mutations, which maintain base pairing but destroy sequence homology. Ideally, alignment programs would take RNA structure into account. The Sankoff algorithm for the simultaneous solution of RNA structure prediction and RNA sequence alignment was proposed 20 years ago but suffers from its exponential complexity. A number of programs implement lightweight versions of the Sankoff algorithm by restricting its application to a limited type of structure and/or only pairwise alignment. Thus, despite recent advances, the proper alignment of multiple structural RNA sequences remains a problem.

RESULTS

Here we present StrAl, a heuristic method for alignment of ncRNA that reduces sequence-structure alignment to a two-dimensional problem similar to standard multiple sequence alignment. The scoring function takes into account sequence similarity as well as up- and downstream pairing probability. To test the robustness of the algorithm and the performance of the program, we scored alignments produced by StrAl against a large set of published reference alignments. The quality of alignments predicted by StrAl is far better than that obtained by standard sequence alignment programs, especially when sequence homologies drop below approximately 65%; nevertheless StrAl's runtime is comparable to that of ClustalW.
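
As a rough illustration of how a two-dimensional, quadratic-time alignment can use per-position pairing probabilities, the sketch below runs a standard global dynamic-programming recursion in which each column score adds a structure term to a sequence term. It is not the published StrAl scoring function: the names `column_score` and `align_score`, the geometric-mean form of the structure term, the weight `alpha`, the match/mismatch values and the linear gap penalty are all assumptions made for this example. The pairing vectors are supplied by hand here; in practice they would come from a partition-function folding program.

```python
from math import sqrt

def column_score(a, b, pa, pb, alpha=1.0, match=1.0, mismatch=-1.0):
    """Score for aligning nucleotide a to nucleotide b.

    pa and pb are (p_up, p_down) tuples: the probability that the base
    pairs with a partner upstream or downstream of it. The combination
    used here is a hypothetical choice, not the formula from the paper.
    """
    seq_term = match if a == b else mismatch
    # Structure term: large only when both bases pair in the same direction.
    struct_term = sqrt(pa[0] * pb[0]) + sqrt(pa[1] * pb[1])
    return seq_term + alpha * struct_term

def align_score(s, t, ps, pt, gap=-2.0, alpha=1.0):
    """Optimal global alignment score of s and t in O(len(s) * len(t)) time.

    ps and pt hold one (p_up, p_down) tuple per position of s and t.
    Uses a linear gap penalty (an assumption made for brevity).
    """
    n, m = len(s), len(t)
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            diag = prev[j - 1] + column_score(
                s[i - 1], t[j - 1], ps[i - 1], pt[j - 1], alpha
            )
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]

# Toy input: a G that pairs downstream followed by a C that pairs upstream.
toy_probs = [(0.0, 1.0), (1.0, 0.0)]
print(align_score("GC", "GC", toy_probs, toy_probs))
```

Because the structure information is reduced to one small vector per position, the recursion has the same shape and cost as ordinary pairwise sequence alignment, which is what allows it to be plugged into a progressive multiple-alignment scheme.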
