Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
Genome Res. 2011 Jul;21(7):1087-98. doi: 10.1101/gr.119792.110. Epub 2011 May 31.
Second-generation sequencing technologies allow surveys of sequence variation on an unprecedented scale. However, despite the rapid decrease in sequencing costs, collecting whole-genome sequence data on a population scale is still prohibitively expensive for many laboratories. We have implemented an inexpensive, reduced-representation protocol for preparing resequencing targets, and we have developed the analytical tools necessary for making population genetic inferences. This approach can be applied to any species for which a draft or complete reference genome sequence is available. The new tools we have developed include methods for aligning reads, calling genotypes, and incorporating sample-specific sequencing error rates in the estimation of evolutionary parameters. When applied to 19 individuals from a total of 18 human populations, our approach allowed sampling of regions that are largely overlapping across individuals and that are representative of the entire genome. The resequencing data were used to test the serial founder model of human dispersal and to estimate the time of the Out of Africa migration. Our results also represent the first attempt to provide a time frame for the colonization of Australia based on large-scale resequencing data.