Chan S C, Wong A K, Chiu D K
Department of Systems Design Engineering, University of Waterloo, Canada.
Bull Math Biol. 1992 Jul;54(4):563-98. doi: 10.1007/BF02459635.
Multiple sequence comparison refers to the search for similarity among three or more sequences. This article surveys the exhaustive (optimal) and heuristic (possibly sub-optimal) methods developed for comparing multiple macromolecular sequences, with emphasis on the different approaches taken by the heuristic methods. Four distance measures derived from information engineering and genetic studies are introduced for comparing two alignments of the same sequences. Entropy, which plays a central role in information theory as a measure of information, choice and uncertainty, is proposed as a simple measure for evaluating the optimality of an alignment in the absence of any a priori knowledge about the structures of the sequences being compared. The article also gives two examples comparing alternative alignments of the same set of 5S RNAs obtained by several different heuristic methods.
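The abstract does not state how the entropy measure is computed, so the following is only a minimal sketch of one common reading: the Shannon entropy of each alignment column, summed over all columns, with lower totals indicating more consistent columns. The function names `column_entropy` and `alignment_entropy`, the treatment of the gap character `-` as an ordinary symbol, and the toy sequences are assumptions made for illustration; they are not taken from the article.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column.

    Assumption: a gap ('-') is counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(alignment):
    """Sum of column entropies for an alignment given as equal-length strings."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) iterates over the columns of the alignment.
    return sum(column_entropy(col) for col in zip(*alignment))


# Two hypothetical alignments of the same three toy sequences.
alignment_a = ["ACGU", "ACGU", "AC-U"]
alignment_b = ["ACGU", "ACGU", "A-CU"]

score_a = alignment_entropy(alignment_a)
score_b = alignment_entropy(alignment_b)
print(f"alignment A: {score_a:.3f} bits")
print(f"alignment B: {score_b:.3f} bits")
```

Under this sketch, alignment A has one mixed column and alignment B has two, so A receives the lower total entropy and would be preferred. A measure of this kind needs no prior structural knowledge, which is the property the abstract highlights.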