School of Computer Engineering, Nanyang Technological University, Singapore.
Bioinformatics. 2010 May 15;26(10):1368-9. doi: 10.1093/bioinformatics/btq135. Epub 2010 Mar 26.
Multiple sequence alignment is an important tool in bioinformatics. Although efficient heuristic algorithms exist for this problem, the exponential growth of biological data demands even higher throughput. The recent emergence of multi-core technologies has made it possible to substantially reduce the execution time of many bioinformatics applications. In this article, we introduce an implementation that accelerates the distance matrix computation on x86 and the Cell Broadband Engine, a homogeneous and a heterogeneous multi-core system, respectively. By taking advantage of multiple processors as well as Single Instruction Multiple Data (SIMD) vectorization, we achieved speed-ups of two orders of magnitude compared to the publicly available implementation used in ClustalW.
Source code in C is publicly available at https://sourceforge.net/projects/distmatcomp/
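To illustrate why the distance matrix stage parallelizes well across cores, the following is a minimal sketch and not the published implementation. It assumes a stand-in pairwise measure (Levenshtein edit distance divided by the longer sequence length) in place of ClustalW's alignment-based scoring, and the names `pair_distance` and `distance_matrix` are hypothetical. Each pair of sequences is scored independently, so the outer loop can be split across threads with OpenMP; the SIMD vectorization mentioned in the abstract is not shown.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in pairwise measure (an assumption for illustration): Levenshtein
 * edit distance divided by the longer length. Not ClustalW's scoring. */
double pair_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    if (la == 0 && lb == 0)
        return 0.0;

    /* One DP row is enough; each call owns its buffer, so calls are thread-safe. */
    size_t *row = malloc((lb + 1) * sizeof *row);
    assert(row != NULL);
    for (size_t j = 0; j <= lb; j++)
        row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];            /* value of cell (i-1, j-1) */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up  = row[j];         /* value of cell (i-1, j) */
            size_t sub = diag + (a[i - 1] != b[j - 1]);
            size_t del = up + 1;
            size_t ins = row[j - 1] + 1;
            size_t best = sub < del ? sub : del;
            row[j] = best < ins ? best : ins;
            diag = up;
        }
    }

    double d = (double)row[lb] / (double)(la > lb ? la : lb);
    free(row);
    return d;
}

/* Fill a symmetric n x n matrix stored row-major in dist.
 * Every (i, j) pair is independent, so rows are distributed across threads.
 * Dynamic scheduling balances the triangular workload (row i has i pairs). */
void distance_matrix(const char *const *seqs, int n, double *dist)
{
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        dist[i * n + i] = 0.0;
        for (int j = 0; j < i; j++) {
            double d = pair_distance(seqs[i], seqs[j]);
            dist[i * n + j] = d;         /* lower triangle */
            dist[j * n + i] = d;         /* mirrored upper triangle */
        }
    }
}
```

With GCC or Clang, compiling with `-fopenmp` enables the parallel loop; without that flag the pragma is ignored and the same code runs serially. Each matrix cell is written by exactly one loop iteration, so no locking is needed.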