Bellgard Matthew, Gamble Thomas, Reynolds Mark, Hunter Adam, Trifonov Ed, Taplin Ross
Centre for Bioinformatics and Biological Computing, School of Information Technology, Murdoch University, WA, Australia.
Appl Bioinformatics. 2003;2(3 Suppl):S31-5.
Pairwise sequence alignment is one of the most fundamental tools in comparative genomic sequence analysis. It is used to compare gene and protein sequences in order to infer structural, functional and evolutionary relationships. However, current 'mainstream' alignment algorithms use optimisation criteria driven primarily by computational efficiency, with parameters such as gap penalties that are not biologically motivated. In addition, algorithms such as Smith-Waterman return a single alignment, which can be sensitive to rather arbitrary parameter choices, gap penalties among them. This paper explores the range of properties that result from posing alignment as a problem of 'mapping gaps in sequences'. We argue that this approach is intuitive and gives greater control over the number of gaps placed within an alignment. An approach of this type was proposed by Sankoff (1972) but has received little attention. We evaluate it against other techniques using structurally confirmed alignments from a benchmark alignment database. The approach consistently yields optimal and near-optimal alignments and is therefore a viable approach to sequence alignment.
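To make the idea of controlling the number of gaps concrete, the following is a minimal sketch of a gap-count-constrained global alignment in the spirit of Sankoff (1972). It is not the authors' implementation. It assumes the simplest possible scoring (one point per identical aligned pair, no gap penalty) and treats a "gap" as a maximal run of consecutive insertions or deletions in one sequence. The function name `max_matches_by_gap_count` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import List, Optional

UNREACHABLE = -10**9  # sentinel: no alignment ends in this state


def max_matches_by_gap_count(a: str, b: str, max_gaps: int) -> List[Optional[int]]:
    """Return a list r where r[g] is the largest number of identical aligned
    pairs in any global alignment of a and b that uses at most g gaps,
    or None if no such alignment exists.

    Assumption (illustrative only): score = count of identical pairs.
    """
    n, m = len(a), len(b)
    PAIR, GAP_IN_B, GAP_IN_A = 0, 1, 2  # type of the last alignment column

    # dp[g][i][j][s]: best score aligning a[:i] with b[:j] using exactly
    # g gaps, with the last column of type s.
    dp = [[[[UNREACHABLE] * 3 for _ in range(m + 1)] for _ in range(n + 1)]
          for _ in range(max_gaps + 1)]
    dp[0][0][0][PAIR] = 0  # empty alignment, no gaps yet

    for g in range(max_gaps + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                cell = dp[g][i][j]
                if i > 0 and j > 0:
                    # Align a[i-1] with b[j-1]; the gap count is unchanged.
                    cell[PAIR] = max(dp[g][i - 1][j - 1]) + (a[i - 1] == b[j - 1])
                if i > 0:
                    # a[i-1] faces '-': extend the current gap in b ...
                    best = dp[g][i - 1][j][GAP_IN_B]
                    if g > 0:
                        # ... or open a new gap, which uses one of the g gaps.
                        prev = dp[g - 1][i - 1][j]
                        best = max(best, prev[PAIR], prev[GAP_IN_A])
                    cell[GAP_IN_B] = best
                if j > 0:
                    # b[j-1] faces '-': the symmetric case.
                    best = dp[g][i][j - 1][GAP_IN_A]
                    if g > 0:
                        prev = dp[g - 1][i][j - 1]
                        best = max(best, prev[PAIR], prev[GAP_IN_B])
                    cell[GAP_IN_A] = best

    # Convert "exactly g gaps" into "at most g gaps" with a running maximum.
    result: List[Optional[int]] = []
    best_so_far = UNREACHABLE
    for g in range(max_gaps + 1):
        best_so_far = max(best_so_far, max(dp[g][n][m]))
        result.append(best_so_far if best_so_far >= 0 else None)
    return result
```

Calling the function with increasing `max_gaps` shows how the match count responds to each additional permitted gap, which is the kind of control a single penalty-based alignment does not expose directly. A realistic version would replace the identity score with a substitution matrix and add a traceback to recover the alignment itself.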