Hall J D, Myers E W
Department of Molecular and Cellular Biology, University of Arizona, Tucson 85721.
Comput Appl Biosci. 1988 Mar;4(1):35-40. doi: 10.1093/bioinformatics/4.1.35.
We describe software for aligning protein or nucleic acid sequences based on the concept of match density. The method is especially useful for locating short regions of similarity between two longer sequences that may be largely dissimilar (e.g. locating active-site regions in distantly related proteins). The software can identify biologically interesting similarities between two sub-regions because it lets the user control the matching parameters and the manner in which local alignments are selected for display. Furthermore, the collection and ranking of alignments for display uses a novel, highly efficient algorithm. We illustrate these features with several examples. In addition, we show that this tool can be used to find a new conserved sequence in several viral DNA polymerases, which, we suggest, occurs at a functionally important enzymatic site.
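The abstract does not define match density or describe the ranking algorithm, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes match density means the fraction of identical positions in a gap-free window of fixed length, and it uses a brute-force scan rather than the efficient algorithm the abstract mentions. The function name `match_density_hits` and the parameters `window`, `min_density`, and `top` are hypothetical.

```python
def match_density_hits(a, b, window=8, min_density=0.75, top=5):
    """Rank gap-free windows of two sequences by match density.

    Assumption (not from the paper): density = identical positions / window.
    Returns up to `top` tuples (density, start_in_a, start_in_b),
    highest density first, ties broken by start positions.
    """
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            # Count identical residues in the two aligned windows.
            matches = sum(x == y for x, y in zip(a[i:i + window], b[j:j + window]))
            density = matches / window
            if density >= min_density:
                hits.append((density, i, j))
    # Highest density first; deterministic order for equal densities.
    hits.sort(key=lambda h: (-h[0], h[1], h[2]))
    return hits[:top]


if __name__ == "__main__":
    # Two otherwise dissimilar toy sequences sharing one short motif.
    seq_a = "QQQQACDEFGHIQQQQ"
    seq_b = "WWACDEFGHIWWWWWW"
    for density, i, j in match_density_hits(seq_a, seq_b):
        print(f"{density:.3f}  a[{i}:{i + 8}]  b[{j}:{j + 8}]")
```

Exposing the window length and density threshold as arguments mirrors the abstract's point that the user controls the matching parameters and how local alignments are selected for display. The scan costs time proportional to the product of the sequence lengths times the window length, so it is suitable only for small inputs.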