Thompson J D, Higgins D G, Gibson T J
European Molecular Biology Laboratory, Heidelberg, Germany.
Nucleic Acids Res. 1994 Nov 11;22(22):4673-80. doi: 10.1093/nar/22.22.4673.
The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than in regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.
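The first modification, sequence weighting, can be illustrated with a small sketch. The abstract does not say how the weights are computed, so the scheme below is an assumption: weights are read off a guide tree, and each branch length is shared equally among the sequences beneath it. The tree encoding, the function name `sequence_weights`, and the branch lengths are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: tree-derived sequence weights.
# Assumption: a leaf is a string (sequence name); an internal node is a
# list of (child, branch_length) pairs. This encoding is hypothetical.

def sequence_weights(node):
    """Return {sequence_name: weight} for the subtree rooted at `node`.

    Each branch length is divided equally among the leaves below it, so
    sequences that share most of their path to the root (near-duplicates)
    end up with smaller weights than isolated, divergent sequences.
    """
    if isinstance(node, str):
        return {node: 0.0}
    weights = {}
    for child, branch_length in node:
        sub = sequence_weights(child)
        share = branch_length / len(sub)  # split the branch among its leaves
        for leaf, weight in sub.items():
            weights[leaf] = weight + share
    return weights


# Hypothetical guide tree: A and B are close relatives, C is divergent.
tree = [
    ([("A", 0.1), ("B", 0.1)], 0.4),
    ("C", 0.5),
]

weights = sequence_weights(tree)
for name in sorted(weights):
    print(name, round(weights[name], 3))
```

In this toy tree, A and B split the 0.4 branch they share, so each receives a lower weight than C, which keeps its whole 0.5 branch. That matches the stated goal of down-weighting near-duplicate sequences and up-weighting the most divergent ones.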