Moore G W, Goodman M
J Mol Evol. 1977 Apr 29;9(2):121-30. doi: 10.1007/BF01732744.
Closely related proteins show an obvious kinship by having numerous matching amino acids in their aligned sequences. Kinship between anciently separated proteins requires a statistical evaluation to rule out fortuitous similarities. A simple statistic is developed which assumes equal probability for all codon pairs, and a table of critical values for amino acid sequence alignments of length 200 or less is presented. Applying this statistic to V and C regions of immunoglobulin chains, aligned on the basis of shared features of three-dimensional structure, provides evidence that the V and C sequences descended from a common ancestor. Similarly, the distant evolutionary relationship of dehydrogenases, flavodoxin, and subtilisin, suggested by structural alignments, is verified. On the other hand, the statistic does not verify a common evolutionary origin for the heme binding pocket in globins and cytochrome b5. Empirical evidence from the distribution of MMD values of amino acid pairs in comparisons of misaligned polypeptide chains and from Monte Carlo trials of sequences aligned with arbitrary gaps supports the validity of the statistic.
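The abstract does not give the formula for the statistic, so the following is only a minimal sketch of the general idea of a null model with equally probable codons. It is not the authors' statistic: it assumes that matches are scored as amino acid identities, that the 61 sense codons of the standard genetic code are drawn uniformly and independently at each aligned position, and that significance is read from an exact binomial tail. The function names are hypothetical.

```python
from math import comb
from collections import Counter

# Standard genetic code, codons ordered with bases T, C, A, G ('*' marks stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def chance_match_probability():
    """Probability that two independent, uniformly drawn sense codons
    encode the same amino acid (assumed null model, not the paper's)."""
    counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    total = sum(counts.values())  # 61 sense codons
    return sum(n * n for n in counts.values()) / (total * total)

def binomial_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_matches(n, alpha, p):
    """Smallest number of matches whose chance probability is <= alpha."""
    for k in range(n + 1):
        if binomial_tail(n, k, p) <= alpha:
            return k
    return None  # even n matches out of n would not reach significance

if __name__ == "__main__":
    p = chance_match_probability()
    for length in (50, 100, 200):
        print(length, critical_matches(length, 0.05, p))
```

Under these assumptions the chance-match probability is 235/3721, about 0.063, and `critical_matches` plays the role of one row of a critical-value table for a given alignment length. A statistic based on mutation distances between codons, as the mention of MMD values suggests, would need a different scoring step.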