Ahn Jaegyoon, Xiao Xinshu
Department of Integrative Biology and Physiology and the Molecular Biology Institute, University of California Los Angeles, Los Angeles, CA 90095, USA.
Bioinformatics. 2015 Dec 15;31(24):3906-13. doi: 10.1093/bioinformatics/btv505. Epub 2015 Aug 30.
Accurate identification of genetic variants such as single-nucleotide polymorphisms (SNPs) or RNA editing sites from RNA-Seq reads is important, yet challenging, because it necessitates a very low false-positive rate in read mapping. Although many read aligners are available, no single aligner was specifically developed or tested as an effective tool for SNP and RNA editing prediction.
We present RASER, an accurate read aligner with novel mapping schemes and an index tree structure that aims to reduce false-positive mappings caused by highly similar regions. We demonstrate that RASER achieves the best mapping accuracy among the popular aligners compared and the highest sensitivity in identifying multiply mapped reads. As a result, RASER displays superb efficacy in unbiased mapping of the alternative alleles of SNPs and in identification of RNA editing sites.
RASER is written in C++ and freely available for download at https://github.com/jaegyoonahn/RASER.