Welch John J
Centre for the Study of Evolution, School of Life Sciences, University of Sussex, Brighton, BN1 9QG, UK.
Genetics. 2006 Jun;173(2):821-37. doi: 10.1534/genetics.106.056911. Epub 2006 Apr 2.
When polymorphism and divergence data are available for multiple loci, extended forms of the McDonald-Kreitman test can be used to estimate the average proportion of the amino acid divergence due to adaptive evolution, a statistic denoted alpha. But such tests are subject to many biases. Most serious is the possibility that high estimates of alpha reflect demographic changes rather than adaptive substitution. Testing for between-locus variation in alpha is one possible way of distinguishing between demography and selection. However, such tests have yielded contradictory results, and their efficacy is unclear. Estimates of alpha from the same model organisms have also varied widely. This study clarifies the reasons for these discrepancies, identifying several method-specific biases in widely used estimators and assessing the power of the methods. As part of this process, a new maximum-likelihood estimator is introduced. This estimator is applied to a newly compiled data set of 115 genes from Drosophila simulans, each with orthologs from D. melanogaster and D. yakuba. In this way, alpha is estimated to be approximately 0.4 ± 0.1, a value that does not vary substantially between different loci or over different periods of divergence. The implications of these results are discussed.
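To make the statistic concrete, the sketch below computes alpha with a simple pooled McDonald-Kreitman-style estimator, alpha = 1 - (Ds x Pn) / (Dn x Ps), where the counts are summed across loci before the ratio is taken. This is only an illustration of the general idea, not the maximum-likelihood estimator introduced in the paper. The class name `MKCounts`, the function name `pooled_alpha`, and all counts in the usage note are hypothetical and are not taken from the study's data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class MKCounts:
    """Hypothetical container for one locus's McDonald-Kreitman table."""
    dn: int  # nonsynonymous (amino acid) fixed differences between species
    ds: int  # synonymous fixed differences between species
    pn: int  # nonsynonymous polymorphisms within species
    ps: int  # synonymous polymorphisms within species


def pooled_alpha(loci: Iterable[MKCounts]) -> float:
    """Estimate alpha from counts summed over all loci.

    Uses alpha = 1 - (Ds * Pn) / (Dn * Ps), a basic pooled estimator.
    Assumes synonymous sites are neutral and that polymorphism reflects
    only neutral variation; both are simplifying assumptions.
    """
    loci = list(loci)
    dn = sum(locus.dn for locus in loci)
    ds = sum(locus.ds for locus in loci)
    pn = sum(locus.pn for locus in loci)
    ps = sum(locus.ps for locus in loci)
    if dn == 0 or ps == 0:
        # The ratio is undefined without amino acid divergence
        # or synonymous polymorphism.
        raise ValueError("Dn and Ps must both be greater than zero")
    return 1.0 - (ds * pn) / (dn * ps)
```

With two made-up loci, `pooled_alpha([MKCounts(dn=12, ds=20, pn=4, ps=10), MKCounts(dn=18, ds=30, pn=5, ps=20)])` sums the counts to Dn = 30, Ds = 50, Pn = 9, Ps = 30 and returns 0.5. Summing before taking the ratio avoids undefined per-locus ratios when a single gene has zero counts, but it does not address the demographic and method-specific biases that the abstract describes.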