Accurate inference of subtle population structure (and other genetic discontinuities) using principal coordinates.

Authors

Reeves Patrick A, Richards Christopher M

Affiliation

United States Department of Agriculture, Agricultural Research Service, National Center for Genetic Resources Preservation, Fort Collins, Colorado, United States of America.

Publication information

PLoS One. 2009;4(1):e4269. doi: 10.1371/journal.pone.0004269. Epub 2009 Jan 27.

DOI: 10.1371/journal.pone.0004269

PMID: 19172174

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2625398/

Abstract

BACKGROUND: Accurate inference of genetic discontinuities between populations is an essential component of intraspecific biodiversity and evolution studies, as well as associative genetics. The most widely-used methods to infer population structure are model-based, Bayesian MCMC procedures that minimize Hardy-Weinberg and linkage disequilibrium within subpopulations. These methods are useful, but suffer from large computational requirements and a dependence on modeling assumptions that may not be met in real data sets. Here we describe the development of a new approach, PCO-MC, which couples principal coordinate analysis to a clustering procedure for the inference of population structure from multilocus genotype data.

METHODOLOGY/PRINCIPAL FINDINGS: PCO-MC uses data from all principal coordinate axes simultaneously to calculate a multidimensional "density landscape", from which the number of subpopulations, and the membership within subpopulations, is determined using a valley-seeking algorithm. Using extensive simulations, we show that this approach outperforms a Bayesian MCMC procedure when many loci (e.g. 100) are sampled, but that the Bayesian procedure is marginally superior with few loci (e.g. 10). When presented with sufficient data, PCO-MC accurately delineated subpopulations with population F(st) values as low as 0.03 (G'(st)>0.2), whereas the limit of resolution of the Bayesian approach was F(st) = 0.05 (G'(st)>0.35).

CONCLUSIONS/SIGNIFICANCE: We draw a distinction between population structure inference for describing biodiversity as opposed to Type I error control in associative genetics. We suggest that discrete assignments, like those produced by PCO-MC, are appropriate for circumscribing units of biodiversity whereas expression of population structure as a continuous variable is more useful for case-control correction in structured association studies.
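
The principal coordinate step mentioned in the abstract can be illustrated with a minimal sketch of classical principal coordinate analysis (double-centering a squared distance matrix, then eigendecomposition). This is not the PCO-MC program: the density landscape and valley-seeking steps are not reproduced here. The function name `pcoa`, the toy genotype matrix, and the choice of Euclidean distance between genotype vectors are all assumptions made for illustration.

```python
import numpy as np

def pcoa(dist):
    """Classical principal coordinate analysis of a square distance matrix.

    Returns the coordinates on every axis with a positive eigenvalue,
    together with those eigenvalues (largest first).
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # B is symmetric, so eigh applies; reorder from largest eigenvalue down
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep only axes with a clearly positive eigenvalue
    keep = eigvals > 1e-10
    coords = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return coords, eigvals[keep]

# Hypothetical toy data: 6 individuals x 4 biallelic loci, coded as
# allele counts (0, 1, 2). Rows 0-2 and rows 3-5 form two groups.
genotypes = np.array([
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [2, 2, 1, 2],
    [2, 1, 2, 2],
    [2, 2, 2, 1],
], dtype=float)

# Assumed distance: Euclidean distance between genotype vectors
diff = genotypes[:, None, :] - genotypes[None, :, :]
dist = np.linalg.norm(diff, axis=2)

coords, eigvals = pcoa(dist)
print(coords[:, 0])             # position of each individual on axis 1
print(eigvals / eigvals.sum())  # share of variation carried by each axis
```

In this toy example the two groups fall on opposite sides of the first axis. A clustering procedure that uses all retained axes at once, as the abstract describes, would take `coords` as its input.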

Similar articles

Improving the inference of population genetic structure in the presence of related individuals.

Genet Res (Camb). 2014;96:e003. doi: 10.1017/S0016672314000068.

The effect of close relatives on unsupervised Bayesian clustering algorithms in population genetic structure analysis.

Mol Ecol Resour. 2012 Sep;12(5):873-84. doi: 10.1111/j.1755-0998.2012.03156.x. Epub 2012 May 28.

Inference of population structure under a Dirichlet process model.

Genetics. 2007 Apr;175(4):1787-802. doi: 10.1534/genetics.106.061317. Epub 2007 Jan 21.

Characterization of a Bayesian genetic clustering algorithm based on a Dirichlet process prior and comparison among Bayesian clustering methods.

BMC Bioinformatics. 2011 Jun 28;12:263. doi: 10.1186/1471-2105-12-263.

A spatial statistical model for landscape genetics.

Genetics. 2005 Jul;170(3):1261-80. doi: 10.1534/genetics.104.033803. Epub 2004 Nov 1.

Bayesian inference of recent migration rates using multilocus genotypes.

Genetics. 2003 Mar;163(3):1177-91. doi: 10.1093/genetics/163.3.1177.

A non-parametric approach to population structure inference using multilocus genotypes.

Hum Genomics. 2006 Jun;2(6):353-64. doi: 10.1186/1479-7364-2-6-353.

Empirical Bayes inference of pairwise F(ST) and its distribution in the genome.

Genetics. 2007 Oct;177(2):861-73. doi: 10.1534/genetics.107.077263. Epub 2007 Jul 29.

Human population structure detection via multilocus genotype clustering.

BMC Genet. 2007 Jun 25;8:34. doi: 10.1186/1471-2156-8-34.

Cited by

Coalescent-Based Species Delimitation in Herbaceous Bamboos (Bambusoideae, Olyreae) from Eastern Brazil: Implications for Taxonomy and Conservation in a Group with Weak Morphological Divergence Coupled with Low Genetic Diversity.

Plants (Basel). 2022 Dec 26;12(1):107. doi: 10.3390/plants12010107.

Genetic fingerprinting and aflatoxin production of Aspergillus section Flavi associated with groundnut in eastern Ethiopia.

BMC Microbiol. 2021 Aug 28;21(1):239. doi: 10.1186/s12866-021-02290-3.

Influence of Environmental Factors on the Genetic and Chemical Diversity of Populations Growing in Fragmented Shrublands from Mexico.

Plants (Basel). 2021 Feb 8;10(2):325. doi: 10.3390/plants10020325.

Molecular Evidence for Two Domestication Events in the Pea Crop.

Genes (Basel). 2018 Nov 6;9(11):535. doi: 10.3390/genes9110535.

Can asexuality confer a short-term advantage? Investigating apparent biogeographic success in the apomictic triploid fern Myriopteris gracilis.

Am J Bot. 2017 Aug;104(8):1254-1265. doi: 10.3732/ajb.1700126.

Genetic diversity of Atlantic Bluefin tuna in the Mediterranean Sea: insights from genome-wide SNPs and microsatellites.

J Biol Res (Thessalon). 2017 Feb 16;24:3. doi: 10.1186/s40709-017-0062-2. eCollection 2017 Dec.

Next-generation sampling: Pairing genomics with herbarium specimens provides species-level signal in Solidago (Asteraceae).

Appl Plant Sci. 2015 Jun 8;3(6). doi: 10.3732/apps.1500014. eCollection 2015 Jun.

Detecting individual ancestry in the human genome.

Investig Genet. 2015 May 1;6:7. doi: 10.1186/s13323-015-0019-x. eCollection 2015.

Genetic evidence for reproductive isolation among sympatric Epichloë endophytes as inferred from newly developed microsatellite markers.

Microb Ecol. 2015 Jul;70(1):51-60. doi: 10.1007/s00248-014-0556-5. Epub 2014 Dec 28.

Increased genetic divergence between two closely related fir species in areas of range overlap.

Ecol Evol. 2014 Apr;4(7):1019-29. doi: 10.1002/ece3.1007. Epub 2014 Mar 3.
