Stephen F. Schaffner, Catherine Foo, Stacey Gabriel, David Reich, Mark J. Daly, David Altshuler
Program in Medical and Population Genetics, The Broad Institute, Cambridge, Massachusetts 02139, USA.
Genome Res. 2005 Nov;15(11):1576-83. doi: 10.1101/gr.3709305.
Population genetic models play an important role in human genetic research, connecting empirical observations about sequence variation with hypotheses about underlying historical and biological causes. More specifically, models are used to compare empirical measures of sequence variation, linkage disequilibrium (LD), and selection to expectations under a "null" distribution. In the absence of detailed information about human demographic history, and about variation in mutation and recombination rates, simulations have of necessity used arbitrary models, usually simple ones. With the advent of large empirical data sets, it is now possible to calibrate population genetic models with genome-wide data, permitting for the first time the generation of data that are consistent with empirical data across a wide range of characteristics. We present here the first such calibrated model and show that, while still arbitrary, it successfully generates simulated data (for three populations) that closely resemble empirical data in allele frequency, linkage disequilibrium, and population differentiation. No assertion is made about the accuracy of the proposed historical and recombination model, but its ability to generate realistic data meets a long-standing need among geneticists. We anticipate that this model, for which software is publicly available, and others like it will have numerous applications in empirical studies of human genetics.
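As an illustration of the kind of comparison the abstract describes, the sketch below computes the three classes of summary statistic it names (allele frequency, linkage disequilibrium, and population differentiation) from 0/1 haplotype matrices. It is not the authors' software: the function names, the data layout (rows are chromosomes, columns are sites), the use of r² as the LD measure, and the equal-weight two-population F_ST estimator are all assumptions chosen for brevity.

```python
from typing import List, Sequence

# Assumed layout: one row per sampled chromosome, one column per site,
# alleles coded 0 (ancestral) or 1 (derived). Names here are hypothetical.
Haplotypes = List[Sequence[int]]


def allele_frequency(haps: Haplotypes, site: int) -> float:
    """Frequency of allele 1 at a site."""
    return sum(h[site] for h in haps) / len(haps)


def ld_r_squared(haps: Haplotypes, i: int, j: int) -> float:
    """Pairwise LD between sites i and j, measured as r^2."""
    n = len(haps)
    p_i = allele_frequency(haps, i)
    p_j = allele_frequency(haps, j)
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ij = sum(1 for h in haps if h[i] == 1 and h[j] == 1) / n
    d = p_ij - p_i * p_j  # disequilibrium coefficient D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return float("nan")  # r^2 is undefined at a monomorphic site
    return d * d / denom


def fst(pop_a: Haplotypes, pop_b: Haplotypes, site: int) -> float:
    """Single-site F_ST = (H_T - H_S) / H_T, populations weighted equally."""
    p_a = allele_frequency(pop_a, site)
    p_b = allele_frequency(pop_b, site)
    p_bar = (p_a + p_b) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return float("nan")  # site is fixed in both populations
    h_s = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2  # mean within-population
    return (h_t - h_s) / h_t
```

Applying the same functions to simulated and to empirical haplotypes would give two sets of statistics whose distributions can be compared. That comparison is what calibration checks, under whatever estimators a real analysis actually uses.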