Zhang Z, Pearson W R, Miller W
Department of Computer Science and Engineering, The Pennsylvania State University, University Park 16802, USA.
J Comput Biol. 1997 Fall;4(3):339-49. doi: 10.1089/cmb.1997.4.339.
We develop several algorithms for the problem of aligning a DNA sequence with a protein sequence. Our methods account for frameshift errors, but not for introns in the DNA sequence. Thus, they are particularly appropriate for comparing a cDNA sequence that suffers from sequencing errors with an amino acid sequence or a protein sequence database. We describe algorithms for computing optimal alignments under several definitions of DNA-protein alignment, verify sufficient conditions for the equivalence of certain definitions, describe techniques for efficient implementation, and discuss our experience with these ideas in a new release of the FASTA suite of database-searching programs.
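To make the idea of a frameshift-tolerant DNA-protein alignment concrete, the following is a minimal sketch of one simple global scoring scheme. It is not taken from the paper and is not the FASTA implementation. The function name `align_score`, the match/mismatch scores, and the gap and frameshift penalties are all illustrative assumptions. A codon (3 nucleotides) may be aligned with one amino acid, a codon or an amino acid may be aligned with nothing at a gap penalty, and 1 or 2 stray nucleotides may be skipped at a frameshift penalty.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def align_score(dna, protein, match=2, mismatch=-1, gap=3, frameshift=4):
    """Best global score for aligning `dna` with `protein`.

    All scores and penalties are hypothetical values chosen for illustration.
    S[i][j] is the best score for aligning dna[:i] with protein[:j].
    """
    n, m = len(dna), len(protein)
    neg_inf = float("-inf")
    S = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if j >= 1:
                # Amino acid aligned with nothing (gap in the DNA).
                best = max(best, S[i][j - 1] - gap)
            if i >= 3:
                # Whole codon aligned with nothing (gap in the protein).
                best = max(best, S[i - 3][j] - gap)
                if j >= 1:
                    # Codon translated and aligned with one amino acid.
                    # Codons with ambiguous bases translate to "X" (an assumption).
                    aa = CODON_TABLE.get(dna[i - 3:i], "X")
                    sub = match if aa == protein[j - 1] else mismatch
                    best = max(best, S[i - 3][j - 1] + sub)
            if i >= 1:
                # Frameshift: skip one stray nucleotide.
                best = max(best, S[i - 1][j] - frameshift)
            if i >= 2:
                # Frameshift: skip two stray nucleotides.
                best = max(best, S[i - 2][j] - frameshift)
            S[i][j] = best
    return S[n][m]


print(align_score("ATGAAAGTT", "MKV"))   # error-free cDNA
print(align_score("ATGCAAAGTT", "MKV"))  # one inserted nucleotide after ATG
```

With these assumed parameters, the error-free sequence scores three matches, while the sequence with an inserted `C` still recovers all three matches at the cost of a single frameshift penalty. A practical implementation would use a substitution matrix and affine gap penalties instead of the flat scores used here.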