François Olivier, Ancelet Sophie, Guillot Gilles
TIMC, TIMB (Department of Mathematical Biology), La Tronche, France.
Genetics. 2006 Oct;174(2):805-16. doi: 10.1534/genetics.106.059923. Epub 2006 Aug 3.
We introduce a new Bayesian clustering algorithm for studying population structure using individually geo-referenced multilocus data sets. The algorithm is based on the concept of a hidden Markov random field, which models the spatial dependencies at the cluster membership level. We argue that (i) a Markov chain Monte Carlo procedure can implement the algorithm efficiently, (ii) it can detect significant geographical discontinuities in allele frequencies and regulate the number of clusters, (iii) it can check whether the clusters obtained without the use of spatial priors are robust to the hypothesis of discontinuous geographical variation in allele frequencies, and (iv) it can reduce the number of loci required to obtain accurate assignments. We illustrate and discuss the implementation issues with the Scandinavian brown bear and the human CEPH diversity panel data sets.
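To make the idea of a spatial prior on cluster membership concrete, here is a minimal sketch of one Gibbs-style update for a single individual's cluster label. It rests on assumptions not stated in the abstract: the hidden Markov random field is taken to be a Potts model with a hypothetical interaction parameter `psi`, the neighborhood structure is supplied as a plain adjacency list, and the per-cluster log-likelihoods `log_lik` (which would come from the allele frequencies) are passed in precomputed. The function names are illustrative and are not taken from the authors' software.

```python
import math
import random


def potts_conditional(labels, neighbors, i, n_clusters, psi):
    """Prior P(z_i = k | neighbor labels) under an assumed Potts model.

    The weight of cluster k grows as exp(psi * number of neighbors in k),
    so psi = 0 gives a uniform (non-spatial) prior.
    """
    counts = [0] * n_clusters
    for j in neighbors[i]:
        counts[labels[j]] += 1
    weights = [math.exp(psi * c) for c in counts]
    total = sum(weights)
    return [w / total for w in weights]


def gibbs_update(labels, neighbors, i, log_lik, psi, rng):
    """Resample z_i with probability proportional to prior(k) * likelihood(k).

    log_lik[k] is the (precomputed, hypothetical) log-likelihood of
    individual i's genotype under cluster k.
    """
    n_clusters = len(log_lik)
    prior = potts_conditional(labels, neighbors, i, n_clusters, psi)
    log_post = [math.log(p) + ll for p, ll in zip(prior, log_lik)]
    m = max(log_post)  # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_post]
    labels[i] = rng.choices(range(n_clusters), weights=weights)[0]
    return labels[i]


# Toy example: four individuals along a transect, two clusters.
neighbors = [[1], [0, 2], [1, 3], [2]]
labels = [0, 0, 1, 1]
rng = random.Random(0)
prior_spatial = potts_conditional(labels, neighbors, 0, 2, psi=1.0)
prior_flat = potts_conditional(labels, neighbors, 0, 2, psi=0.0)
```

In this toy setup, `prior_flat` is uniform over the two clusters, while `prior_spatial` favors the cluster of individual 0's only neighbor. Sweeping `gibbs_update` over all individuals, alternating with updates of the allele frequencies, would give one possible Markov chain Monte Carlo scheme of the general kind the abstract describes.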