
A genealogical interpretation of principal components analysis.

Author information

McVean Gil

Affiliation

Department of Statistics, University of Oxford, Oxford, United Kingdom.

Publication information

PLoS Genet. 2009 Oct;5(10):e1000686. doi: 10.1371/journal.pgen.1000686. Epub 2009 Oct 16.

DOI: 10.1371/journal.pgen.1000686
PMID: 19834557
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2757795/

Abstract

Principal components analysis, PCA, is a statistical method commonly used in population genetics to identify structure in the distribution of genetic variation across geographical location and ethnic background. However, while the method is often used to inform about historical demographic processes, little is known about the relationship between fundamental demographic parameters and the projection of samples onto the primary axes. Here I show that for SNP data the projection of samples onto the principal components can be obtained directly from considering the average coalescent times between pairs of haploid genomes. The result provides a framework for interpreting PCA projections in terms of underlying processes, including migration, geographical isolation, and admixture. I also demonstrate a link between PCA and Wright's F_ST and show that SNP ascertainment has a largely simple and predictable effect on the projection of samples. Using examples from human genetics, I discuss the application of these results to empirical data and the implications for inference.
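
The link between coalescent times and PCA projections can be illustrated with a small numerical sketch. The abstract does not state the formula, so the code below assumes one plausible form of the relationship: the expected sample covariance is taken to be proportional to the double-centred negative matrix of expected pairwise coalescence times, M_ij = t̄_i + t̄_j − t̄ − t_ij. The function names, the two-population toy model, and its parameter values (`within`, `split`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def expected_pca_from_coalescence_times(t):
    """Eigen-decompose the double-centred negative coalescence-time matrix.

    Assumption (not stated in the abstract): the expected covariance between
    samples i and j is proportional to  mean_i + mean_j - grand_mean - t[i, j].
    """
    t = np.asarray(t, dtype=float)
    n = t.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    m = -centering @ t @ centering          # double centring of -t
    eigvals, eigvecs = np.linalg.eigh(m)    # m is symmetric
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each eigenvector by sqrt(eigenvalue) to get sample coordinates.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, coords


def two_population_times(n_per_pop, within=1.0, split=0.5):
    """Toy model: two equally sized populations separated by extra time `split`."""
    labels = np.repeat([0, 1], n_per_pop)
    same = labels[:, None] == labels[None, :]
    t = np.where(same, within, within + split)
    np.fill_diagonal(t, 0.0)                # a sample coalesces with itself at time 0
    return t


times = two_population_times(3)
eigvals, coords = expected_pca_from_coalescence_times(times)
pc1 = coords[:, 0]
print(np.round(pc1, 3))
```

In this toy model the first axis places all samples from one population at a single coordinate and all samples from the other at the opposite sign, so the separation on PC1 grows with the extra between-population coalescence time. The sign of each axis is arbitrary, as with any eigen-decomposition.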

Similar articles

1
A genealogical interpretation of principal components analysis.
PLoS Genet. 2009 Oct;5(10):e1000686. doi: 10.1371/journal.pgen.1000686. Epub 2009 Oct 16.
2
How do SNP ascertainment schemes and population demographics affect inferences about population history?
BMC Genomics. 2015 Apr 3;16(1):266. doi: 10.1186/s12864-015-1469-5.
3
A spectral theory for Wright's inbreeding coefficients and related quantities.
PLoS Genet. 2021 Jul 19;17(7):e1009665. doi: 10.1371/journal.pgen.1009665. eCollection 2021 Jul.
4
Coalescents and genealogical structure under neutrality.
Annu Rev Genet. 1995;29:401-21. doi: 10.1146/annurev.ge.29.120195.002153.
5
Assessing the power of principal components and wright's fixation index analyzes applied to reveal the genome-wide genetic differences between herds of Holstein cows.
BMC Genet. 2020 Apr 28;21(1):47. doi: 10.1186/s12863-020-00848-0.
6
Principal components analysis of population admixture.
PLoS One. 2012;7(7):e40115. doi: 10.1371/journal.pone.0040115. Epub 2012 Jul 9.
7
Exploring Population Structure with Admixture Models and Principal Component Analysis.
Methods Mol Biol. 2020;2090:67-86. doi: 10.1007/978-1-0716-0199-0_4.
8
Eigenanalysis of SNP data with an identity by descent interpretation.
Theor Popul Biol. 2016 Feb;107:65-76. doi: 10.1016/j.tpb.2015.09.004. Epub 2015 Oct 23.
9
Reconstructing Past Admixture Processes from Local Genomic Ancestry Using Wavelet Transformation.
Genetics. 2015 Jun;200(2):469-81. doi: 10.1534/genetics.115.176842. Epub 2015 Apr 7.
10
Theoretical formulation of principal components analysis to detect and correct for population stratification.
PLoS One. 2010 Sep 17;5(9):e12510. doi: 10.1371/journal.pone.0012510.

Cited by

1
randPedPCA: rapid approximation of principal components from large pedigrees.
Genet Sel Evol. 2025 Aug 28;57(1):46. doi: 10.1186/s12711-025-00994-y.
2
Functional genomics of trypanotolerant and trypanosusceptible cattle infected with Trypanosoma congolense across multiple time points and tissues.
PLoS Negl Trop Dis. 2025 Aug 4;19(8):e0012882. doi: 10.1371/journal.pntd.0012882. eCollection 2025 Aug.
3
Decoding past microbial life and antibiotic resistance in İnonü Cave's archaeological soil.
PLoS One. 2025 Jul 31;20(7):e0326358. doi: 10.1371/journal.pone.0326358. eCollection 2025.
4
Gene-Based Burden Testing of Rare Variants in Hemiplegic Migraine: A Computational Approach to Uncover the Genetic Architecture of a Rare Brain Disorder.
Genes (Basel). 2025 Jul 9;16(7):807. doi: 10.3390/genes16070807.
5
Suturing fragmented landscapes: Mosaic hybrid zones in plants may facilitate ecosystem resiliency.
Proc Natl Acad Sci U S A. 2025 Aug 5;122(31):e2410941122. doi: 10.1073/pnas.2410941122. Epub 2025 Jul 28.
6
Admixed and single-continental genome segments of the same ancestry have distinct linkage disequilibrium patterns.
Genome Biol. 2025 Jul 11;26(1):201. doi: 10.1186/s13059-025-03672-w.
7
Without the locals' aid: no evidence for a role of admixture in the colonisation success of Italian wall lizards.
Oecologia. 2025 Jul 8;207(7):125. doi: 10.1007/s00442-025-05769-2.
8
Evolutionary Influences on Local Patterns of Genetic Relatedness.
bioRxiv. 2025 May 28:2025.05.02.651970. doi: 10.1101/2025.05.02.651970.
9
Two distinct host-specialized fungal species cause white-nose disease in bats.
Nature. 2025 May 28. doi: 10.1038/s41586-025-09060-5.
10
A probabilistic approach to visualize the effect of missing data on PCA in ancient human genomics.
BMC Genomics. 2025 May 27;26(1):537. doi: 10.1186/s12864-025-11728-1.

References

1
Genes mirror geography within Europe.
Nature. 2008 Nov 6;456(7218):98-101. doi: 10.1038/nature07331. Epub 2008 Aug 31.
2
Principal component analysis of genetic data.
Nat Genet. 2008 May;40(5):491-2. doi: 10.1038/ng0508-491.
3
Interpreting principal component analyses of spatial population genetic variation.
Nat Genet. 2008 May;40(5):646-9. doi: 10.1038/ng.139. Epub 2008 Apr 20.
4
Population structure and eigenanalysis.
PLoS Genet. 2006 Dec;2(12):e190. doi: 10.1371/journal.pgen.0020190.
5
Principal components analysis corrects for stratification in genome-wide association studies.
Nat Genet. 2006 Aug;38(8):904-9. doi: 10.1038/ng1847. Epub 2006 Jul 23.
6
The fate of mutations surfing on the wave of a range expansion.
Mol Biol Evol. 2006 Mar;23(3):482-90. doi: 10.1093/molbev/msj057. Epub 2005 Nov 9.
7
A haplotype map of the human genome.
Nature. 2005 Oct 27;437(7063):1299-320. doi: 10.1038/nature04226.
8
Calibrating a coalescent simulation of human genome sequence variation.
Genome Res. 2005 Nov;15(11):1576-83. doi: 10.1101/gr.3709305.
9
The effect of the Neolithic expansion on European molecular diversity.
Proc Biol Sci. 2005 Apr 7;272(1564):679-88. doi: 10.1098/rspb.2004.2999.
10
A genealogical interpretation of linkage disequilibrium.
Genetics. 2002 Oct;162(2):987-91. doi: 10.1093/genetics/162.2.987.