Cache-Oblivious parallel SIMD Viterbi decoding for sequence search in HMMER.

Affiliation

Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal.

Publication information

BMC Bioinformatics. 2014 May 30;15:165. doi: 10.1186/1471-2105-15-165.

DOI:10.1186/1471-2105-15-165

PMID:24884826

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4229909/

Abstract

BACKGROUND

HMMER is a widely used bioinformatics tool that applies Hidden Markov Models (HMMs) to analyze and process biological sequences. One of its main homology engines is based on the Viterbi decoding algorithm, which had already been highly parallelized and optimized using Farrar's striped processing pattern with the Intel SSE2 instruction set extension.

RESULTS

This work proposes a new SIMD vectorization of the Viterbi decoding algorithm, based on an SSE2 inter-task parallelization approach similar to the DNA alignment algorithm proposed by Rognes. Besides this alternative vectorization scheme, the proposed implementation introduces a new partitioning of the Markov model that exploits cache locality significantly more efficiently. This optimization, together with improved loading of the emission scores, yields a constant processing throughput, regardless of the innermost-cache size and of the size of the model considered.
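The inter-task idea can be illustrated with a minimal sketch. It is not the COPS implementation: it assumes a generic HMM with dense log-score tables rather than HMMER's profile model, uses plain Python lists in place of SSE2 registers, and omits the model partitioning and the emission-score loading. All function and parameter names (`viterbi_score`, `viterbi_scores_inter_task`, `start`, `trans`, `emit`) are hypothetical. The point is the data layout: lane `j` of every row belongs to sequence `j`, so one pass over the model advances several independent sequences in lockstep.

```python
def viterbi_score(seq, start, trans, emit):
    """Scalar reference: best log-score over all state paths for one sequence.

    seq is a non-empty list of symbol indices; start[s], trans[r][s] and
    emit[s][sym] are log-scores (assumed layout, not HMMER's).
    """
    n_states = len(start)
    prev = [start[s] + emit[s][seq[0]] for s in range(n_states)]
    for sym in seq[1:]:
        prev = [
            max(prev[r] + trans[r][s] for r in range(n_states)) + emit[s][sym]
            for s in range(n_states)
        ]
    return max(prev)


def viterbi_scores_inter_task(seqs, start, trans, emit):
    """Inter-task layout: dp[s][j] is the score of state s for sequence j.

    Each dp[s] row plays the role of one SIMD vector whose lanes hold
    different sequences, so lanes never depend on each other.
    """
    n_states, lanes = len(start), len(seqs)
    dp = [
        [start[s] + emit[s][seqs[j][0]] for j in range(lanes)]
        for s in range(n_states)
    ]
    longest = max(len(seq) for seq in seqs)
    for i in range(1, longest):
        new_dp = []
        for s in range(n_states):
            row = []
            for j in range(lanes):
                if i >= len(seqs[j]):
                    # This lane's sequence has ended: freeze its score.
                    row.append(dp[s][j])
                else:
                    best_prev = max(
                        dp[r][j] + trans[r][s] for r in range(n_states)
                    )
                    row.append(best_prev + emit[s][seqs[j][i]])
            new_dp.append(row)
        dp = new_dp
    return [max(dp[s][j] for s in range(n_states)) for j in range(lanes)]
```

In a real SIMD version the loop over `j` would collapse into a single vector instruction per state update. Because each lane carries its own sequence, there are no dependencies between lanes to resolve, which is the main contrast with an intra-task layout such as the striped pattern, where all lanes hold positions of the same model.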

CONCLUSIONS

The proposed optimized vectorization of the Viterbi decoding algorithm was extensively evaluated against the HMMER3 decoder on DNA and protein datasets and proved to be a competitive alternative implementation. The proposed Cache-Oblivious Parallel SIMD Viterbi (COPS) implementation was always faster than the already highly optimized ViterbiFilter implementation of HMMER3; it provides a constant throughput and a speedup of up to two times, depending on the model's size.
