HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Hong Kong, China.
PLoS One. 2013 May 31;8(5):e65632. doi: 10.1371/journal.pone.0065632. Print 2013.
To cope with the exponentially increasing throughput of Next-Generation Sequencing (NGS), most existing short-read aligners can be configured to favor speed at the expense of accuracy and sensitivity. SOAP3-dp, by leveraging the computational power of both CPU and GPU with optimized algorithms, delivers high speed and high sensitivity simultaneously. Compared with widely adopted aligners including BWA, Bowtie2, SeqAlto, CUSHAW2 and GEM, and with the GPU-based aligners BarraCUDA and CUSHAW, SOAP3-dp was found to be two to tens of times faster while maintaining the highest sensitivity and lowest false discovery rate (FDR) on Illumina reads of different lengths. Unlike its predecessor SOAP3, which does not allow gapped alignment, SOAP3-dp by default tolerates alignment similarity as low as 60%. Evaluation on real human genome data demonstrates that SOAP3-dp enables more authentic variants and longer indels to be discovered. Fosmid sequencing shows a 9.1% FDR on newly discovered deletions. SOAP3-dp natively supports the BAM file format and provides the same scoring scheme as BWA, which enables it to be integrated into existing analysis pipelines. SOAP3-dp has been deployed on Amazon-EC2, NIH-Biowulf and Tianhe-1A.
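The two accuracy metrics named above can be illustrated with a minimal sketch. It assumes the common definitions (sensitivity as the fraction of true items recovered, FDR as the fraction of reported items that are false); the abstract does not state the exact formulas used in the evaluation, and the function name and counts below are hypothetical, not taken from the paper.

```python
def sensitivity_and_fdr(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Return (sensitivity, FDR) from confusion counts.

    Assumed definitions (not confirmed by the abstract):
      sensitivity = TP / (TP + FN)
      FDR         = FP / (TP + FP)
    """
    recovered_total = true_pos + false_neg   # everything that should have been found
    reported_total = true_pos + false_pos    # everything that was actually reported
    if recovered_total == 0 or reported_total == 0:
        raise ValueError("need at least one true item and one reported item")
    return true_pos / recovered_total, false_pos / reported_total


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, fdr = sensitivity_and_fdr(true_pos=90, false_pos=10, false_neg=10)
print(f"sensitivity={sens:.3f}, FDR={fdr:.3f}")
```

Under these definitions, an aligner or variant caller improves by raising the first value while lowering the second, which is the sense in which the abstract reports the highest sensitivity together with the lowest FDR.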