INRA, UMR1313 GABI, Jouy-en-Josas, France.
PLoS One. 2010 Aug 2;5(8):e11913. doi: 10.1371/journal.pone.0011913.
BACKGROUND: The recent advent of high-throughput SNP genotyping technologies has opened new avenues of research for population genetics. In particular, a growing interest in the identification of footprints of selection, based on genome scans for adaptive differentiation, has emerged.
METHODOLOGY/PRINCIPAL FINDINGS: The purpose of this study is to develop an efficient model-based approach for Bayesian exploratory analyses of adaptive differentiation in very large SNP data sets. The basic idea is to start with a very simple model for neutral loci that is easy to implement in a Bayesian framework and to identify selected loci as outliers via posterior predictive P-values (PPP-values). Applications of this strategy are considered using two different statistical models. The first was initially interpreted in the context of populations that each evolve under pure genetic drift from a common ancestral population, while the second relies on populations at migration-drift equilibrium. The robustness and power of the two resulting Bayesian model-based approaches to detect SNPs under selection are further evaluated through extensive simulations. An application to a cattle data set is also provided.
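To make the PPP-value idea concrete, the following is a minimal sketch and not the models used in the study. It assumes a deliberately simplified null in which all populations share one allele frequency per locus (no drift parameter), with a Beta(1, 1) prior, and it uses the variance of sample allele frequencies across populations as the discrepancy statistic. The function name `ppp_values`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def ppp_values(counts, sizes, n_rep=2000, seed=0):
    """Posterior predictive P-values per locus under a toy null model.

    counts, sizes: integer arrays of shape (n_loci, n_pops) holding the
    reference-allele counts and the number of sampled gene copies.
    ASSUMPTION: one shared allele frequency per locus, Beta(1, 1) prior.
    This is an illustration only, not the drift or migration-drift model.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=int)
    sizes = np.asarray(sizes, dtype=int)

    # Observed discrepancy: variance of sample frequencies across populations.
    t_obs = np.var(counts / sizes, axis=1)

    # Conjugate Beta posterior for the shared frequency of each locus.
    a_post = 1.0 + counts.sum(axis=1)
    b_post = 1.0 + (sizes - counts).sum(axis=1)

    # Draw frequencies from the posterior: shape (n_rep, n_loci).
    p = rng.beta(a_post, b_post, size=(n_rep, counts.shape[0]))

    # Simulate replicate counts: shape (n_rep, n_loci, n_pops).
    y_rep = rng.binomial(sizes[None, :, :], p[:, :, None])
    t_rep = np.var(y_rep / sizes, axis=2)

    # PPP-value: share of replicates at least as differentiated as observed.
    return (t_rep >= t_obs).mean(axis=0)

# Toy data: locus 0 is homogeneous, locus 1 is strongly differentiated.
counts = [[25, 26, 24, 25], [5, 45, 25, 25]]
sizes = [[50, 50, 50, 50], [50, 50, 50, 50]]
ppp = ppp_values(counts, sizes)
```

With this convention, a PPP-value close to 0 marks a locus that is more differentiated than the null model predicts, which is how such a locus would be flagged as an outlier in this sketch.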
CONCLUSIONS/SIGNIFICANCE: The procedure described turns out to be much faster than earlier Bayesian approaches and also reasonably efficient, especially at detecting loci under positive selection.