Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania 15219, USA.
RNA. 2020 May;26(5):531-540. doi: 10.1261/rna.073015.119. Epub 2020 Jan 31.
The importance of noncoding RNA sequences has become increasingly clear over the past decade. New RNA families are often detected and analyzed using comparative methods based on multiple sequence alignments. Accordingly, a number of programs have been developed for aligning and deriving secondary structures from sets of RNA sequences. Yet, the best tools for these tasks remain unclear because existing benchmarks contain too few sequences belonging to only a small number of RNA families. RNAconTest (RNA consistency test) is a new benchmarking approach relying on the observation that secondary structure is often conserved across highly divergent RNA sequences from the same family. RNAconTest scores multiple sequence alignments based on the level of consistency among known secondary structures belonging to reference sequences in their output alignment. Similarly, consensus secondary structure predictions are scored according to their agreement with one or more known structures in a family. Comparing the performance of 10 popular alignment programs using RNAconTest revealed that DAFS, DECIPHER, LocARNA, and MAFFT created the most structurally consistent alignments. The best consensus secondary structure predictions were generated by DAFS and LocARNA (via RNAalifold). Many of the methods specific to noncoding RNAs exhibited poor scalability as the number or length of input sequences increased, and several programs displayed substantial declines in score as more sequences were aligned. Overall, RNAconTest provides a means of testing and improving tools for comparative RNA analysis, as well as highlighting the best available approaches. RNAconTest is available from the DECIPHER website (http://DECIPHER.codes/Downloads.html).
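The consistency idea can be illustrated with a minimal sketch. It is not RNAconTest's actual scoring formula, which the abstract does not specify. It assumes that known structures are given in dot-bracket notation without pseudoknots, that gaps are written as `-` or `.`, and that consistency is measured as the Jaccard overlap of paired alignment columns averaged over all pairs of reference sequences. The function names `paired_columns`, `pairwise_consistency`, and `alignment_consistency` are hypothetical.

```python
from itertools import combinations


def paired_columns(aligned_seq, dot_bracket):
    """Map a known structure onto alignment columns.

    aligned_seq: one row of the alignment, with gaps as '-' or '.'.
    dot_bracket: structure of the ungapped sequence, e.g. "((...))".
    Returns the set of (column_i, column_j) base pairs.
    """
    # Alignment column occupied by each ungapped residue, in order.
    columns = [c for c, ch in enumerate(aligned_seq) if ch not in "-."]
    if len(columns) != len(dot_bracket):
        raise ValueError("structure length must match ungapped sequence length")
    stack, pairs = [], set()
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            opener = stack.pop()
            pairs.add((columns[opener], columns[pos]))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def pairwise_consistency(pairs_a, pairs_b):
    """Jaccard overlap of two sets of paired columns (assumed metric)."""
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 1.0


def alignment_consistency(rows):
    """Mean pairwise overlap across all reference rows.

    rows: list of (aligned_seq, dot_bracket) tuples, one per reference
    sequence with a known structure.
    """
    mapped = [paired_columns(seq, db) for seq, db in rows]
    scores = [pairwise_consistency(a, b) for a, b in combinations(mapped, 2)]
    return sum(scores) / len(scores) if scores else 1.0


# Toy data: the same two sequences aligned in two different ways.
well_aligned = [("GGGAAACCC", "(((...)))"), ("GG-AAA-CC", "((...))")]
misaligned = [("GGGAAACCC", "(((...)))"), ("GGAAACC--", "((...))")]
print(alignment_consistency(well_aligned))
print(alignment_consistency(misaligned))
```

In this toy case the first alignment places the shorter sequence's base pairs in columns that are also paired in the longer sequence, so it scores higher than the second alignment, where no paired columns coincide.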