Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Bioinformatics. 2019 Sep 1;35(17):2966-2973. doi: 10.1093/bioinformatics/btz033.
Genomic variant detection from next-generation sequencing has become an essential component of research and clinical diagnosis in both cancer and Mendelian disorders. Insertions and deletions (indels) are a common source of variation and frequently affect gene function, which makes their detection vitally important. While substantial effort has gone into detecting indels from DNA, there is still room for improvement. Further, detection of indels from RNA-Seq data has largely been an afterthought and represents another important opportunity for variant detection.
We present ABRA2, a redesign of the original ABRA implementation that supports realignment of both RNA and DNA short reads. The redesign improves accuracy and scalability, including support for human whole genomes. Results demonstrate substantial improvement in indel detection for a variety of data types, including those that were not previously supported by ABRA. Further, ABRA2 broadly improves variant calling accuracy across a wide range of post-processing workflows, including whole genomes, targeted exomes and transcriptome sequencing.
ABRA2 is implemented in a combination of Java and C/C++ and is freely available at https://github.com/mozack/abra2.
Supplementary data are available at Bioinformatics online.