Cornish-Bowden A
Biochem J. 1980 Nov 1;191(2):349-54. doi: 10.1042/bj1910349.
Because evolution occurs by random events, the actual number of substitutions that occur in any period is not exactly equal to the number expected from the mean rate of substitution, but is statistically distributed about it. In consequence, even if rates of evolution are constant in different lineages, 'trees' deduced from descendant protein sequences contain random errors. When there are fewer than about eight differences between the sequences of the most distantly related pair from a set of proteins, this random effect is very large. It can then render trivial the statistical disadvantage inherent in using a crude measure of protein difference, such as amino acid composition or immunological cross-reactivity, in preference to a measure based on amino acid sequence. In some cases, such as classification of mammals on the basis of cytochrome c structure, it appears to make little difference to the reliability of the results whether the sequences of the protein concerned are known or not. It may also be possible to obtain more reliable phylogenetic information from composition measurements on several kinds of protein than one could obtain from sequence measurements on a single kind of protein.
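The size of this random effect can be illustrated with a rough numerical sketch. It assumes that substitution counts follow a Poisson distribution about their expected value and that each substitution shows up as one observed difference (no repeated changes at the same site); the abstract states neither assumption explicitly. The function names and the example means are hypothetical, chosen only for illustration.

```python
import math
import random


def relative_sd(expected_substitutions: float) -> float:
    """Standard deviation divided by the mean for a Poisson count (assumed model)."""
    return 1.0 / math.sqrt(expected_substitutions)


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw one Poisson variate by Knuth's multiplication method (adequate for small means)."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def misordering_rate(mean_near: float, mean_far: float,
                     trials: int = 20000, seed: int = 0) -> float:
    """Fraction of trials in which the truly closer pair shows at least as
    many differences as the truly more distant pair."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        near = sample_poisson(mean_near, rng)
        far = sample_poisson(mean_far, rng)
        if near >= far:
            wrong += 1
    return wrong / trials


if __name__ == "__main__":
    # With about eight expected differences the scatter is roughly a third of the mean.
    print(f"relative sd at 8 expected differences: {relative_sd(8):.2f}")
    # Same 1:2 ratio of true distances, but ten times as many substitutions.
    print(f"misordering rate, 4 vs 8:   {misordering_rate(4, 8):.3f}")
    print(f"misordering rate, 40 vs 80: {misordering_rate(40, 80):.4f}")
```

Under these assumptions the relative scatter falls only as the inverse square root of the expected count, so a pair of distances in a fixed ratio is ranked wrongly far more often when the counts are small than when they are large. This is in line with the abstract's point that, at low numbers of differences, sampling error can outweigh the loss of precision from using a cruder measure.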