Mathieson Iain, McVean Gil
Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom.
PLoS Genet. 2014 Aug 7;10(8):e1004528. doi: 10.1371/journal.pgen.1004528. eCollection 2014 Aug.
Large whole-genome sequencing projects have provided access to much rare variation in human populations, which is highly informative about population structure and recent demography. Here, we show how the age of rare variants can be estimated from patterns of haplotype sharing and how these ages can be related to historical relationships between populations. We investigate the distribution of the age of variants occurring exactly twice (f2 variants) in a worldwide sample sequenced by the 1000 Genomes Project, revealing enormous variation across populations. The median age of haplotypes carrying f2 variants is 50 to 160 generations across populations within Europe or Asia, and 170 to 320 generations within Africa. Haplotypes shared between continents are much older, with median ages for haplotypes shared between Europe and Asia ranging from 320 to 670 generations. The distribution of the ages of f2 haplotypes is informative about the demography of the populations that carry them, revealing recent bottlenecks, ancient splits, and more modern connections between populations. We see the effect of selection in the observation that functional variants are significantly younger than nonfunctional variants of the same frequency. This approach is relatively insensitive to mutation rate and complements other nonparametric methods for demographic inference.
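As an illustration of the definition used above (a variant occurring exactly twice in the sample), the following minimal Python sketch finds such variants in a toy 0/1 haplotype matrix and tallies which population pairs share them. It does not estimate ages, since the abstract does not describe that procedure. The function names `find_f2_variants` and `count_f2_sharing`, the matrix layout, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def find_f2_variants(haplotypes):
    """Return (site_index, (i, j)) for every site whose derived allele (1)
    is carried by exactly two haplotypes i < j.

    Assumption: `haplotypes` is a list of equal-length 0/1 sequences,
    one row per sampled haplotype. This layout is hypothetical.
    """
    n_sites = len(haplotypes[0])
    doubletons = []
    for site in range(n_sites):
        carriers = [idx for idx, hap in enumerate(haplotypes) if hap[site] == 1]
        if len(carriers) == 2:
            doubletons.append((site, (carriers[0], carriers[1])))
    return doubletons


def count_f2_sharing(haplotypes, populations):
    """Count doubleton variants by unordered pair of population labels.

    Assumption: populations[i] is the label of haplotype i.
    """
    counts = Counter()
    for _, (i, j) in find_f2_variants(haplotypes):
        pair = tuple(sorted((populations[i], populations[j])))
        counts[pair] += 1
    return counts


# Toy data: 4 haplotypes x 5 sites, with illustrative population labels.
haps = [
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 0, 1, 1, 0],
]
pops = ["GBR", "GBR", "CHB", "YRI"]

f2_sites = find_f2_variants(haps)
sharing = count_f2_sharing(haps, pops)
print(f2_sites)
print(dict(sharing))
```

In this toy example only sites 0 and 4 qualify: site 2 is carried by three haplotypes, and sites 1 and 3 by one each. A table of counts like `sharing`, computed on real data, would be the starting point for comparing sharing within and between populations.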