Benner S A, Cohen M A, Gonnet G H
Institute for Organic Chemistry, Swiss Federal Institute of Technology, Zurich.
Protein Eng. 1994 Nov;7(11):1323-32. doi: 10.1093/protein/7.11.1323.
In aligning homologous protein sequences, it is generally assumed that amino acid substitutions subsequent in time occur independently of amino acid substitutions previous in time, i.e. that patterns of mutation are similar at low and high sequence divergence. This assumption is examined here and shown to be incorrect in an interesting way. Separate mutation matrices were constructed for aligned protein sequence pairs at divergences ranging from 5 to 100 PAM units (point accepted mutations per 100 aligned positions). From these, the corresponding log-odds (Dayhoff) matrices, normalized to 250 PAM units, were constructed. The matrices show that the genetic code influences accepted point mutations strongly at early stages of divergence, while the chemical properties of the side chains dominate at more advanced stages.
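The step from a mutation matrix to a log-odds (Dayhoff) matrix at 250 PAM units can be sketched as follows. This is a minimal illustration of the standard Dayhoff-style Markov extrapolation, not the authors' procedure, which the abstract does not spell out. The three-letter alphabet, the frequencies in `FREQS`, the exchangeabilities in `EXCH`, and all function names are hypothetical stand-ins for the 20 amino acids and the empirical counts.

```python
import math

# Hypothetical toy alphabet of three residue types (assumption, not data from the paper).
FREQS = [0.5, 0.3, 0.2]                    # background frequencies, sum to 1
EXCH = [[0, 3, 1], [3, 0, 1], [1, 1, 0]]   # symmetric exchangeabilities


def one_pam_matrix(freqs, exch):
    """Build a reversible mutation matrix M with M[i][j] = P(i is replaced by j),
    scaled so that 1% of positions change on average (1 PAM unit)."""
    n = len(freqs)
    raw = [[exch[i][j] * freqs[j] if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    changed = sum(freqs[i] * sum(raw[i]) for i in range(n))
    scale = 0.01 / changed
    m = [[scale * raw[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] = 1.0 - sum(m[i])  # diagonal is still 0 here, so this is 1 - off-diagonal sum
    return m


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(m, k):
    """Raise m to the k-th power by repeated squaring."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    while k > 0:
        if k & 1:
            result = matmul(result, m)
        m = matmul(m, m)
        k >>= 1
    return result


def log_odds(m_k, freqs):
    """Dayhoff-style score: 10 * log10 of observed over chance-expected pairing."""
    n = len(freqs)
    return [[10.0 * math.log10(m_k[i][j] / freqs[j]) for j in range(n)]
            for i in range(n)]


pam1 = one_pam_matrix(FREQS, EXCH)
pam250 = matpow(pam1, 250)       # extrapolate to 250 PAM units
scores = log_odds(pam250, FREQS)

for row in scores:
    print(" ".join(f"{s:7.2f}" for s in row))
```

Under this model, a matrix estimated at any divergence would extrapolate to the same 250 PAM log-odds matrix if later substitutions were independent of earlier ones. Systematic differences between the normalized matrices are the kind of evidence the abstract describes against that assumption.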