Warris Sven, Yalcin Feyruz, Jackson Katherine J L, Nap Jan Peter
Institute for Life Science & Technology & Hanze Research Center Energy, Hanze University of Applied Sciences Groningen, 9747 AS, Zernikeplein 11, Groningen, The Netherlands.
KeyGene N. V., 6708 PW, Agro Business Park 90, Wageningen, The Netherlands.
PLoS One. 2015 Apr 1;10(4):e0122524. doi: 10.1371/journal.pone.0122524. eCollection 2015.
Obtaining large-scale sequence alignments quickly and flexibly is an important step in the analysis of next-generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often not fast enough, limited to dedicated tasks, or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis.
The Parallel SW Alignment Software (PaSWAS) makes it possible (a) to access the computational power of NVIDIA-based general-purpose graphics processing units (GPGPUs) for high-speed sequence alignments, and (b) to retrieve relevant information such as the score and the number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next-generation sequencing data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
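To illustrate the kind of alignment details mentioned above (score, number of gaps and mismatches), the following is a minimal CPU-only sketch of Smith-Waterman local alignment with traceback. It is not the PaSWAS code: the function name `smith_waterman`, the scoring values (match 2, mismatch -1, linear gap penalty -1) and the returned dictionary layout are assumptions chosen for illustration, and the sketch reports only the single best hit rather than multiple hits per alignment.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Illustrative local alignment with a linear gap penalty (assumed scoring)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached,
    # counting matches, mismatches and gaps along the way.
    i, j = best_pos
    matches = mismatches = gaps = 0
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            if a[i - 1] == b[j - 1]:
                matches += 1
            else:
                mismatches += 1
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            gaps += 1
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            gaps += 1
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return {
        "score": best,
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        "aligned_a": "".join(reversed(top)),
        "aligned_b": "".join(reversed(bottom)),
    }


if __name__ == "__main__":
    print(smith_waterman("ACGTTACG", "ACGTACG"))
```

Each cell of the score matrix depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal can be computed independently. This property is what makes the algorithm a natural fit for parallel hardware such as GPUs.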