Corpet F
Laboratoire de Génétique Cellulaire, INRA Toulouse, France.
Nucleic Acids Res. 1988 Nov 25;16(22):10881-90. doi: 10.1093/nar/16.22.10881.
An algorithm is presented for the multiple alignment of sequences, either proteins or nucleic acids, that is both accurate and easy to use on microcomputers. The approach is based on the conventional dynamic-programming method of pairwise alignment. Initially, a hierarchical clustering of the sequences is performed using the matrix of the pairwise alignment scores. The closest sequences are aligned first, creating groups of aligned sequences. Close groups are then aligned with each other until all sequences are aligned in one group. The pairwise alignments included in the multiple alignment form a new matrix, which is used to produce a second hierarchical clustering. If this clustering differs from the first one, the process can be iterated. The method is illustrated by an example: a global alignment of 39 sequences of cytochrome c.
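The first stage described above (pairwise dynamic-programming scores feeding a hierarchical clustering) can be sketched roughly as follows. This is a minimal illustration, not the published implementation: the function names `nw_score` and `guide_tree` are hypothetical, the match/mismatch/gap values are arbitrary assumptions, and average linkage is an assumed choice, since the abstract does not state the scoring scheme or the linkage rule. The sketch stops at the clustering order; aligning groups to groups and the iteration step are not shown.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment score by dynamic programming.

    Needleman-Wunsch with a linear gap cost. The scoring values are
    illustrative assumptions, not those used in the paper.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def guide_tree(seqs):
    """Cluster sequences hierarchically from their pairwise scores.

    Returns a nested tuple of sequence indices giving the order in which
    sequences (and then groups) would be aligned. Average linkage is an
    assumption made for this sketch.
    """
    n = len(seqs)
    score = {
        frozenset((i, j)): nw_score(seqs[i], seqs[j])
        for i, j in combinations(range(n), 2)
    }
    # Each cluster is (subtree, list of member indices).
    clusters = [(i, [i]) for i in range(n)]

    def avg(c1, c2):
        total = sum(score[frozenset((i, j))] for i in c1[1] for j in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        # Merge the two clusters with the highest average pairwise score.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg(clusters[p[0]], clusters[p[1]]),
        )
        merged = (
            (clusters[x][0], clusters[y][0]),
            clusters[x][1] + clusters[y][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0]
```

Calling `guide_tree(["ACGTAC", "ACGTAA", "TTGC", "TTGG"])` groups the two most similar sequences first, then the next closest pair, and finally joins the two groups, mirroring the order in which a progressive alignment would proceed.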