Evanno G, Regnaut S, Goudet J
Department of Ecology and Evolution, Biology building, University of Lausanne, CH 1015 Lausanne, Switzerland.
Mol Ecol. 2005 Jul;14(8):2611-20. doi: 10.1111/j.1365-294X.2005.02553.x.
The identification of genetically homogeneous groups of individuals is a long-standing issue in population genetics. A recent Bayesian algorithm implemented in the software STRUCTURE allows the identification of such groups. However, the ability of this algorithm to detect the true number of clusters (K) in a sample of individuals when patterns of dispersal among populations are not homogeneous has not been tested. The goal of this study is to carry out such tests, using various dispersal scenarios from data generated with an individual-based model. We found that in most cases the estimated 'log probability of data' does not provide a correct estimation of the number of clusters, K. However, using an ad hoc statistic DeltaK based on the rate of change in the log probability of data between successive K values, we found that STRUCTURE accurately detects the uppermost hierarchical level of structure for the scenarios we tested. As might be expected, the results are sensitive to the type of genetic marker used (AFLP vs. microsatellite), the number of loci scored, the number of populations sampled, and the number of individuals typed in each sample.
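The abstract does not spell out how DeltaK is computed. As a minimal sketch, assuming the form in which the statistic is usually written, DeltaK(K) = mean(|L(K+1) - 2L(K) + L(K-1)|) / sd(L(K)), where L(K) is the log probability of data from replicate runs at a given K, the calculation could look like the following. The function name `delta_k`, the input layout, and the numbers in `example_runs` are hypothetical and chosen only for illustration; they are not results from the study. Pairing replicates by index is also an assumption, and other tools may average across replicates before taking the absolute value.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: DeltaK} for every K that has runs at both K-1 and K+1.

    lnp_by_k maps each K to a list of log-probability-of-data values,
    one per replicate run (hypothetical input layout; at least two
    replicates per K are needed for a standard deviation).
    """
    scores = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # the second difference needs both neighbours
        prev_runs, cur_runs, next_runs = lnp_by_k[k - 1], lnp_by_k[k], lnp_by_k[k + 1]
        # Absolute second-order difference, with replicates paired by index
        # (an assumption; zip stops at the shortest list).
        second_diffs = [
            abs(nxt - 2 * cur + prev)
            for prev, cur, nxt in zip(prev_runs, cur_runs, next_runs)
        ]
        spread = stdev(cur_runs)
        if spread == 0:
            continue  # DeltaK is undefined when replicates are identical
        scores[k] = mean(second_diffs) / spread
    return scores


# Made-up log-probability values for illustration only.
example_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -904.0],
    3: [-890.0, -892.0],
    4: [-885.0, -889.0],
}
example_scores = delta_k(example_runs)
best_k = max(example_scores, key=example_scores.get)
print(best_k, example_scores)
```

With these invented values the log probability keeps rising slowly beyond K = 2, yet DeltaK peaks at K = 2, which is the kind of break in slope the statistic is meant to pick out. Note that DeltaK cannot be evaluated at the smallest or largest K tested, since it needs a neighbour on each side.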