Nature. 2010 Oct 28;467(7319):1061-73. doi: 10.1038/nature09534.
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.