Zhang Z, Schwartz S, Wagner L, Miller W
Department of Computer Science and Engineering, The Pennsylvania State University, University Park 16802, USA.
J Comput Biol. 2000 Feb-Apr;7(1-2):203-14. doi: 10.1089/10665270050081478.
For aligning DNA sequences that differ only by sequencing errors, or by equivalent errors from other sources, a greedy algorithm can be much faster than traditional dynamic programming approaches and yet produce an alignment that is guaranteed to be theoretically optimal. We introduce a new greedy alignment algorithm with particularly good performance and show that it computes the same alignment as does a certain dynamic programming algorithm, while executing over 10 times faster on appropriate data. An implementation of this algorithm is currently used in a program that assembles the UniGene database at the National Center for Biotechnology Information.
对于仅因测序错误或其他来源的等效错误而不同的DNA序列进行比对时,贪心算法可能比传统的动态规划方法快得多,并且能产生理论上保证最优的比对结果。我们引入了一种具有特别良好性能的新贪心比对算法,并表明它与特定的动态规划算法计算出相同的比对结果,同时在适当的数据上执行速度快10倍以上。该算法的一个实现目前用于美国国立生物技术信息中心组装UniGene数据库的程序中。