Population structure and eigenanalysis.

Authors

Patterson Nick, Price Alkes L, Reich David

Affiliation

Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America.

Publication information

PLoS Genet. 2006 Dec;2(12):e190. doi: 10.1371/journal.pgen.0020190.

DOI:10.1371/journal.pgen.0020190

PMID:17194218

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1713260/

Abstract

Current methods for inferring population structure from genetic data do not provide formal significance tests for population differentiation. We discuss an approach to studying population structure (principal components analysis) that was first applied to genetic data by Cavalli-Sforza and colleagues. We place the method on a solid statistical footing, using results from modern statistics to develop formal significance tests. We also uncover a general "phase change" phenomenon about the ability to detect structure in genetic data, which emerges from the statistical theory we use, and has an important implication for the ability to discover structure in genetic data: for a fixed but large dataset size, divergence between two populations (as measured, for example, by a statistic like FST) below a threshold is essentially undetectable, but a little above threshold, detection will be easy. This means that we can predict the dataset size needed to detect structure.
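The "phase change" described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' software; it is a minimal illustration under stated assumptions: genotypes for two equal-sized populations are drawn from a Balding-Nichols model (an assumption, not named in the abstract), each marker is centered and scaled to roughly unit variance, and the largest eigenvalue of the sample covariance matrix is compared with the random-matrix bulk edge (1 + sqrt(n/m))^2 for n individuals and m markers. In this simplified model the detection threshold works out to roughly FST = 1/sqrt(nm); the abstract itself gives no formula, so treat that value as an assumption of the sketch. The function names `simulate_genotypes`, `top_eigenvalue`, and `bulk_edge` are hypothetical.

```python
import numpy as np


def simulate_genotypes(n_per_pop, n_markers, fst, rng):
    """Diploid genotypes (0/1/2) for two populations.

    Assumption: Balding-Nichols model, i.e. each population's allele
    frequency is drawn from a Beta distribution around a shared
    ancestral frequency, with spread controlled by fst.
    """
    p_anc = rng.uniform(0.1, 0.9, size=n_markers)
    a = p_anc * (1.0 - fst) / fst
    b = (1.0 - p_anc) * (1.0 - fst) / fst
    freqs = rng.beta(a, b, size=(2, n_markers))
    pops = [
        rng.binomial(2, freqs[k], size=(n_per_pop, n_markers))
        for k in range(2)
    ]
    return np.vstack(pops)


def top_eigenvalue(genotypes):
    """Largest eigenvalue of the normalized n x n sample covariance."""
    g = genotypes.astype(float)
    p_hat = g.mean(axis=0) / 2.0
    keep = (p_hat > 0.0) & (p_hat < 1.0)  # drop monomorphic markers
    g, p_hat = g[:, keep], p_hat[keep]
    # Center each marker and scale to roughly unit variance.
    x = (g - 2.0 * p_hat) / np.sqrt(2.0 * p_hat * (1.0 - p_hat))
    cov = x @ x.T / x.shape[1]
    return float(np.linalg.eigvalsh(cov)[-1])


def bulk_edge(n_samples, n_markers):
    """Approximate upper edge of the eigenvalue bulk when there is no structure."""
    return (1.0 + np.sqrt(n_samples / n_markers)) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_per_pop, n_markers = 50, 2000
    n = 2 * n_per_pop
    threshold = 1.0 / np.sqrt(n * n_markers)  # assumed threshold for this model
    print(f"bulk edge: {bulk_edge(n, n_markers):.3f}")
    print(f"assumed FST threshold: {threshold:.5f}")
    for fst in (0.0005, 0.02):  # well below and well above the threshold
        lam = top_eigenvalue(simulate_genotypes(n_per_pop, n_markers, fst, rng))
        print(f"FST={fst}: top eigenvalue {lam:.3f}")
```

With divergence well below the assumed threshold, the top eigenvalue should stay close to the bulk edge, so the structure is effectively invisible; well above it, the top eigenvalue separates clearly. Because the threshold shrinks as n and m grow, the same calculation suggests how large a dataset would be needed to detect a given level of divergence.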

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47da/1756899/22a75b3bad3c/pgen.0020190.g001.jpg
