Center for Bioinformatics, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.
Department of Computer Science, University of Toronto, ON M5S 3G4, Canada; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON M5G 1L7, Canada; Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 1L7, Canada.
Bioinformatics. 2016 Nov 1;32(21):3224-3232. doi: 10.1093/bioinformatics/btw371. Epub 2016 Jul 4.
As high-throughput sequencing (HTS) technology becomes ubiquitous and the volume of data continues to rise, HTS read alignment is increasingly becoming a rate-limiting step, which motivates the development of novel read alignment approaches. Moreover, promising new applications of HTS technology require aligning reads to multiple genomes rather than to a single reference; however, state-of-the-art aligners still cannot feasibly align large numbers of reads to multiple genomes.
We propose de Bruijn Graph-based Aligner (deBGA), a graph-based seed-and-extension algorithm that aligns HTS reads to a reference genome organized and indexed as a de Bruijn graph. Because it handles repeats effectively, deBGA is substantially faster than state-of-the-art approaches while maintaining similar or higher sensitivity and accuracy. This makes it particularly well suited to the rapidly growing volumes of sequencing data. Furthermore, it provides a promising solution for aligning reads to multiple genomes and graph-based references in HTS applications.
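The general idea of seeding against a de Bruijn graph index can be illustrated with a minimal Python sketch. This is not deBGA's implementation: the function names (`build_debruijn_index`, `seed_and_extend`), the mismatch-only extension, and the `max_mismatches` parameter are all assumptions made for illustration. The sketch shows how a k-mer shared by several genomes or repeat copies is stored once as a graph node, so a seed lookup returns every occurrence in a single step.

```python
from collections import defaultdict


def build_debruijn_index(genomes, k):
    """Build a toy de Bruijn graph index over one or more reference sequences.

    Each distinct k-mer is a node. `positions` maps a k-mer to every
    (genome_id, offset) where it occurs; `edges` maps a k-mer to the set of
    k-mers that follow it in some reference.
    """
    positions = defaultdict(list)
    edges = defaultdict(set)
    for gid, seq in enumerate(genomes):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            positions[kmer].append((gid, i))
            if i + k < len(seq):
                # Successor k-mer overlaps the current one by k - 1 bases.
                edges[kmer].add(seq[i + 1:i + k + 1])
    return positions, edges


def seed_and_extend(read, genomes, positions, k, max_mismatches=2):
    """Find candidate placements of `read` by k-mer seeding and extension.

    Extension here is a plain mismatch count over the whole read
    (a simplification; no gaps are handled).
    Returns a sorted list of (genome_id, start, mismatches).
    """
    seen = set()
    hits = []
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for gid, pos in positions.get(seed, ()):
            start = pos - offset
            ref = genomes[gid]
            if (gid, start) in seen:
                continue
            seen.add((gid, start))
            if start < 0 or start + len(read) > len(ref):
                continue
            window = ref[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits.append((gid, start, mismatches))
    return sorted(hits)


if __name__ == "__main__":
    # Two hypothetical toy genomes that differ by a single base.
    genomes = ["ACGTACGTTGCA", "ACGTACCTTGCA"]
    positions, edges = build_debruijn_index(genomes, k=4)
    for hit in seed_and_extend("ACGTTGCA", genomes, positions, k=4):
        print(hit)
```

In this toy index the repeated k-mer `ACGT` is a single node carrying three occurrences across the two genomes, which is the property that lets a graph index avoid redundant work on repeats and on sequence shared between genomes.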
deBGA is available at: https://github.com/hitbc/deBGA
Contact: ydwang@hit.edu.cn
Supplementary information: Supplementary data are available at Bioinformatics online.