Li W H
J Mol Evol. 1993 Jan;36(1):96-9. doi: 10.1007/BF02407308.
The current convention in estimating the number of substitutions per synonymous site (KS) and per nonsynonymous site (KA) between two protein-coding genes is to count each twofold degenerate site as one-third synonymous and two-thirds nonsynonymous, because one of the three possible changes at such a site is synonymous and the other two are nonsynonymous. This counting rule can considerably overestimate the KS value because transitional mutations tend to occur more often than transversional mutations and because most transitional mutations at twofold degenerate sites are synonymous. A new method that gives unbiased estimates is proposed. An application of the new and old methods to 14 pairs of mouse and rat genes shows that the new method gives a KS value very close to the number of substitutions per fourfold degenerate site, whereas the old method gives a value 30% higher. Both methods give a KA value close to the number of substitutions per nondegenerate site.
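The bias described above can be illustrated with a small numerical sketch. The abstract gives no formulas, so both estimators below are assumptions: the first encodes the one-third counting rule as stated, and the second uses the transition-weighted form commonly attributed to the proposed method (transitions averaged over twofold and fourfold sites, plus transversions at fourfold sites). All function names, site counts, and rates are hypothetical, not data from the 14 mouse and rat gene pairs.

```python
def ks_one_third_rule(L2, L4, A2, K4):
    """Old convention (assumed form): a twofold site counts as 1/3 synonymous.

    L2, L4: numbers of twofold and fourfold degenerate sites
    A2:     transitional substitutions per twofold site (treated as synonymous)
    K4:     total substitutions per fourfold site (all synonymous)
    """
    synonymous_substitutions = L2 * A2 + L4 * K4
    synonymous_sites = L2 / 3 + L4
    return synonymous_substitutions / synonymous_sites


def ks_transition_weighted(L2, L4, A2, A4, B4):
    """Assumed form of the revised estimator.

    Transitions are averaged over twofold and fourfold sites, then the
    transversion rate at fourfold sites (B4) is added.
    """
    mean_transitions = (L2 * A2 + L4 * A4) / (L2 + L4)
    return mean_transitions + B4


# Hypothetical inputs: equal numbers of twofold and fourfold sites, and
# transitions twice as frequent as transversions at synonymous positions.
L2, L4 = 300, 300
A = 0.12          # transitions per site, same at twofold and fourfold sites
B = 0.06          # transversions per fourfold site
K4 = A + B        # total substitutions per fourfold site

old = ks_one_third_rule(L2, L4, A, K4)
new = ks_transition_weighted(L2, L4, A, A, B)

print(f"K4            = {K4:.3f}")
print(f"KS (old rule) = {old:.3f}")
print(f"KS (revised)  = {new:.3f}")
print(f"old / K4      = {old / K4:.2f}")
```

With these made-up rates the one-third rule lands about 25% above the fourfold-site rate, while the transition-weighted estimate matches it. The excess arises because the old rule credits all twofold-site transitions to only one-third of a site each, so it grows with the strength of the transition bias.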