Gudyś Adam, Deorowicz Sebastian
Institute of Informatics, Faculty of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland.
PLoS One. 2014 Feb 25;9(2):e88901. doi: 10.1371/journal.pone.0088901. eCollection 2014.
Multiple sequence alignment is a crucial task in a number of biological analyses such as secondary structure prediction, domain searching, and phylogeny. MSAProbs is currently the most accurate alignment algorithm, but its accuracy is obtained at the expense of computational time. In this paper we present QuickProbs, a variant of MSAProbs customised for graphics processors. We selected the two most time-consuming stages of MSAProbs to be redesigned for GPU execution: the posterior matrices calculation and the consistency transformation. Experiments on three popular benchmarks (BAliBASE, PREFAB, OXBench-X) on a quad-core PC equipped with a high-end graphics card show QuickProbs to be 5.7 to 9.7 times faster than the original CPU-parallel MSAProbs. Additional tests performed on several protein families from the Pfam database give an overall speed-up of 6.7. Compared to other algorithms such as MAFFT, MUSCLE, or ClustalW, QuickProbs proved to be much more accurate at similar speed. Additionally, we introduce a tuned variant of QuickProbs which is significantly more accurate than MSAProbs on sets of distantly related sequences without exceeding its computation time. The GPU part of QuickProbs was implemented in OpenCL, so the package is suitable for graphics processors produced by all major vendors.
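The abstract names the consistency transformation as one of the two stages moved to the GPU but does not give its formula. As a rough illustration of why this stage is costly, the sketch below assumes the unweighted, ProbCons-style form, in which each pairwise posterior matrix P_xy is replaced by the average of P_xz · P_zy over all sequences z (with P_xx taken as the identity). MSAProbs is understood to use a sequence-weighted variant, and QuickProbs implements the stage in OpenCL, so this plain-Python version is not the authors' code; the names `consistency_transform`, `matmul`, and `transpose` are hypothetical.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def consistency_transform(posteriors, n_seqs):
    """One unweighted consistency pass (assumed ProbCons-style form).

    posteriors maps (x, y) with x < y to a len(x)-by-len(y) matrix of
    residue-pair match probabilities. Returns a new dict of the same shape.
    """
    def get(x, z):
        # Only x < z is stored; the other orientation is the transpose.
        return posteriors[(x, z)] if x < z else transpose(posteriors[(z, x)])

    relaxed = {}
    for (x, y), p_xy in posteriors.items():
        # z == x and z == y each contribute P_xy, because P_xx is the identity.
        acc = [[2.0 * v for v in row] for row in p_xy]
        for z in range(n_seqs):
            if z in (x, y):
                continue
            prod = matmul(get(x, z), get(z, y))
            acc = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(acc, prod)]
        relaxed[(x, y)] = [[v / n_seqs for v in row] for row in acc]
    return relaxed
```

Every sequence pair requires a matrix product for each third sequence, so the work grows roughly with the cube of the number of sequences. The products for different pairs are independent of each other, which is the kind of structure that maps naturally onto a graphics processor.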