Retamosa Germán, de Pedro Luis, González Ivan, Tamames Javier
High Performance Computing and Networking Department, Universidad Autónoma de Madrid, Madrid, Spain.
National Center for Biotechnology, CSIC, Madrid, Spain.
Evol Bioinform Online. 2016 Dec 18;12:313-322. doi: 10.4137/EBO.S40877. eCollection 2016.
Homology detection has evolved over time from heavyweight algorithms based on dynamic programming to lightweight alternatives based on various heuristic models. However, the main problem with these algorithms is that they use complex statistical models, which makes it difficult to achieve a relevant speedup while still reproducing the original results exactly. Thus, their acceleration is essential. The aim of this article was to prefilter a sequence database. To this end, we have implemented a groundbreaking heuristic model based on NVIDIA's graphics processing units (GPUs) and multicore processors. Depending on the sensitivity settings, this makes it possible to quickly reduce the sequence database by between 50% and 95% while rejecting no significant sequences. Furthermore, this prefiltering application can be used together with multiple homology detection algorithms as part of a next-generation sequencing system. Extensive performance and accuracy tests have been carried out at the Spanish National Center for Biotechnology (NCB). The results show that GPU hardware can accelerate the execution times of earlier homology detection applications, such as the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool for Proteins (BLASTP), by up to a factor of 4.