Feng D F, Doolittle R F
Center for Molecular Genetics, University of California, San Diego, La Jolla, CA 92093-0634, USA.
J Mol Evol. 1997 Apr;44(4):361-70. doi: 10.1007/pl00006155.
Amino acid substitution tables are essential for the proper alignment of protein sequences, and alignment scores based on them can be transformed into distance measures by various means. In the simplest case, the negative log of the score is used. This Poisson relationship, however, assumes that all sites are equally likely to change. A more accurate relationship would correct for different rates of change at each residue position. Recently, Grishin (J. Mol. Evol. 41:675-679, 1995) published a set of simple equations that correct for various circumstances, including different rates of change at different sites. We have used these equations in conjunction with similarity scores that take into account constraints on amino acid interchange. Simulation studies show a linear relationship between these calculated distances and the numbers of allowed mutations, based on the observed variation in rate among sites in various proteins.
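As a rough illustration of the two kinds of relationship contrasted above, the sketch below converts a similarity score into a distance in two ways. It assumes the score has already been normalized to a fraction between 0 and 1; that normalization, the function names, and the parameter `alpha` are assumptions made for illustration and are not taken from the paper. The second function uses the standard gamma-distributed rate correction as a stand-in for a site-rate correction; it is not a reproduction of Grishin's equations.

```python
import math


def poisson_distance(score: float) -> float:
    """Poisson relationship: distance is the negative log of the score.

    `score` is assumed to be a similarity normalized to the range (0, 1].
    This form treats every site as equally likely to change.
    """
    if not 0.0 < score <= 1.0:
        raise ValueError("score must lie in (0, 1]")
    return -math.log(score)


def gamma_distance(score: float, alpha: float) -> float:
    """Gamma-corrected distance allowing rates to differ among sites.

    Uses the standard form d = alpha * (score ** (-1 / alpha) - 1).
    `alpha` is the (hypothetical) gamma shape parameter: small values mean
    strong rate variation among sites; large values approach the Poisson case.
    """
    if not 0.0 < score <= 1.0:
        raise ValueError("score must lie in (0, 1]")
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    return alpha * (score ** (-1.0 / alpha) - 1.0)


if __name__ == "__main__":
    # Compare the two conversions over a few illustrative scores.
    for s in (0.9, 0.5, 0.2):
        print(f"score={s:.1f}  poisson={poisson_distance(s):.3f}  "
              f"gamma(alpha=1)={gamma_distance(s, 1.0):.3f}")
```

For any score below 1, the gamma-corrected value is larger than the Poisson value, and the gap widens as the score falls. This reflects the point made in the abstract: ignoring rate variation among sites tends to underestimate the distance between divergent sequences.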