生物信息学的无缓存动态编程。

Cache-oblivious dynamic programming for bioinformatics.

机构信息

Center for Computational Visualization, Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX 78712-0027, USA.

出版信息

IEEE/ACM Trans Comput Biol Bioinform. 2010 Jul-Sep;7(3):495-510. doi: 10.1109/TCBB.2008.94.

DOI:10.1109/TCBB.2008.94

PMID:20671320

Abstract

We present efficient cache-oblivious algorithms for some well-studied string problems in bioinformatics including the longest common subsequence, global pairwise sequence alignment and three-way sequence alignment (or median), both with affine gap costs, and RNA secondary structure prediction with simple pseudoknots. For each of these problems, we present cache-oblivious algorithms that match the best-known time complexity, match or improve the best-known space complexity, and improve significantly over the cache-efficiency of earlier algorithms. We present experimental results which show that our cache-oblivious algorithms run faster than software and implementations based on previous best algorithms for these problems.

摘要

我们提出了一些高效的、针对生物信息学中一些研究成熟的字符串问题的、基于缓存的算法，包括最长公共子序列、全局两两序列比对和三序列比对（或中位数），都带有仿射间隙代价，以及具有简单假结的 RNA 二级结构预测。对于这些问题中的每一个，我们都提出了基于缓存的算法，其时间复杂度与已知的最优复杂度相匹配，或者在空间复杂度上相匹配或有所改进，并且显著优于之前算法的缓存效率。我们给出了实验结果，这些结果表明我们的基于缓存的算法比针对这些问题的先前最佳算法的软件和实现运行得更快。