Department of Biology II, Ludwig-Maximilians-University Munich, 82152 Planegg, Germany.
Genetics. 2010 Jul;185(3):907-22. doi: 10.1534/genetics.110.116459. Epub 2010 Apr 20.
A major goal of population genomics is to reconstruct the history of natural populations and to infer the neutral and selective scenarios that can explain the present-day polymorphism patterns. However, the separation between neutral and selective hypotheses has proven hard, mainly because both may predict similar patterns in the genome. This study focuses on the development of methods that can be used to distinguish neutral from selective hypotheses in equilibrium and nonequilibrium populations. These methods utilize a combination of statistics on the basis of the site frequency spectrum (SFS) and linkage disequilibrium (LD). We investigate the patterns of genetic variation along recombining chromosomes using a multitude of comparisons between neutral and selective hypotheses, such as selection or neutrality in equilibrium and nonequilibrium populations and recurrent selection models. We perform hypothesis testing using the classical P-value approach, but we also introduce methods from the machine-learning field. We demonstrate that the combination of SFS- and LD-based statistics increases the power to detect recent positive selection in populations that have experienced past demographic changes.
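The abstract does not name the specific statistics it combines, so the following is only a minimal sketch of what one SFS-based and one LD-based summary might look like when computed from the same sample. It assumes Tajima's D as the SFS statistic and the mean pairwise r² as the LD statistic. Both are illustrative choices and not necessarily the ones used in the study. The function names and the 0/1 haplotype-matrix input (rows are haplotypes, columns are sites, 1 marks the derived allele) are likewise hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(haps):
    """Tajima's D for a 0/1 haplotype matrix (illustrative SFS-based statistic)."""
    n = len(haps)
    # Derived-allele count per site, keeping only segregating sites.
    counts = [k for k in (sum(col) for col in zip(*haps)) if 0 < k < n]
    s = len(counts)
    if n < 4 or s == 0:
        return float("nan")  # variance term is zero or undefined here
    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


def mean_r2(haps):
    """Mean pairwise r^2 over segregating sites (illustrative LD-based statistic)."""
    n = len(haps)
    cols = [col for col in zip(*haps) if 0 < sum(col) < n]
    if len(cols) < 2:
        return float("nan")
    values = []
    for a, b in combinations(cols, 2):
        p_a = sum(a) / n
        p_b = sum(b) / n
        p_ab = sum(x & y for x, y in zip(a, b)) / n
        d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
        values.append(d * d / (p_a * (1 - p_a) * p_b * (1 - p_b)))
    return sum(values) / len(values)


def summary_vector(haps):
    """Combine both statistics into one feature vector for a test or classifier."""
    return [tajimas_d(haps), mean_r2(haps)]
```

Such a two-element vector could be compared against its distribution in neutral simulations for a P-value test, or passed to a classifier trained on simulated neutral and selective data sets. Either use would mirror the general approach described above, not its exact implementation.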