The computer program structure for assigning individuals to populations: easy to use but easier to misuse.

Affiliation

Institute of Zoology, Zoological Society of London, London, NW1 4RY, UK.

Publication information

Mol Ecol Resour. 2017 Sep;17(5):981-990. doi: 10.1111/1755-0998.12650. Epub 2017 Feb 7.

DOI: 10.1111/1755-0998.12650
PMID: 28028941

Abstract

The computer program Structure implements a Bayesian method, based on a population genetics model, to assign individuals to their source populations using genetic marker data. It is widely applied in the fields of ecology, evolutionary biology, human genetics and conservation biology for detecting hidden genetic structures, inferring the most likely number of populations (K), assigning individuals to source populations and estimating admixture and migration rates. Recently, several simulation studies repeatedly concluded that the program yields erroneous inferences when samples from different populations are highly unbalanced in size. Analysing both simulated and empirical data sets, this study confirms that Structure indeed yields poor individual assignments to source populations and gives frequently incorrect estimates of K when sampling is unbalanced. However, this poor performance is mainly caused by the adoption of the default ancestry prior, which assumes all source populations contribute equally to the pooled sample of individuals. When the alternative ancestry prior, which allows for unequal representations of the source populations by the sample, is adopted, accurate individual assignments could be obtained even if sampling is highly unbalanced. The alternative prior also improves the inference of K by two estimators, albeit the improvement is not as much as that in individual assignments to populations. For the difficult case of many populations and unbalanced sampling, a rarely used parameter combination of the alternative ancestry prior, an initial ALPHA value much smaller than the default and the uncorrelated allele frequency model is required for Structure to yield accurate inferences. 
I conclude that Structure is easy to use but is easier to misuse because of its complicated genetic model and many parameter (prior) options which may not be obvious to choose, and suggest using multiple plausible models (parameters) and K estimators in conducting comparative and exploratory Structure analysis.
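
The effect of the ancestry prior described above can be illustrated with a small sketch. It assumes, which the abstract does not state explicitly, that ancestry proportions follow a Dirichlet prior: a single shared ALPHA gives every one of the K populations the same expected share 1/K, whereas population-specific values allow unequal shares. The function names and the sample sizes are hypothetical, and the code only compares prior means with the sample composition; it does not reproduce Structure's inference.

```python
from fractions import Fraction


def prior_mean_ancestry(alphas):
    """Mean of a Dirichlet(alphas) prior: alpha_k / sum(alphas)."""
    total = sum(alphas)
    return [Fraction(a) / total for a in alphas]


def sample_proportions(sample_sizes):
    """Share of the pooled sample drawn from each source population."""
    total = sum(sample_sizes)
    return [Fraction(n, total) for n in sample_sizes]


def prior_mismatch(alphas, sample_sizes):
    """Total variation distance between the prior mean and the sample shares.

    0 means the prior expects exactly the observed composition;
    values near 1 mean the prior expectation is badly off.
    """
    prior = prior_mean_ancestry(alphas)
    observed = sample_proportions(sample_sizes)
    return sum(abs(p - o) for p, o in zip(prior, observed)) / 2


if __name__ == "__main__":
    # Hypothetical, highly unbalanced sampling from K = 3 populations.
    sizes = [90, 5, 5]

    # One shared alpha (illustrating the equal-contribution assumption).
    print(prior_mismatch([1, 1, 1], sizes))   # prints 17/30

    # Population-specific alphas (illustrating the alternative prior).
    print(prior_mismatch([18, 1, 1], sizes))  # prints 0
```

With a shared value the prior mean stays at 1/3 per population however unbalanced the sample is, which is the mismatch the abstract attributes to the default prior. The abstract does not name the corresponding program settings; in Structure's parameter files they are believed to be POPALPHAS, ALPHA and FREQSCORR, which should be checked against the program documentation.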

Similar articles

1
The computer program structure for assigning individuals to populations: easy to use but easier to misuse.
Mol Ecol Resour. 2017 Sep;17(5):981-990. doi: 10.1111/1755-0998.12650. Epub 2017 Feb 7.
2
The program structure does not reliably recover the correct population structure when sampling is uneven: subsampling and new estimators alleviate the problem.
Mol Ecol Resour. 2016 May;16(3):608-27. doi: 10.1111/1755-0998.12512. Epub 2016 Mar 2.
3
A parsimony estimator of the number of populations from a STRUCTURE-like analysis.
Mol Ecol Resour. 2019 Jul;19(4):970-981. doi: 10.1111/1755-0998.13000. Epub 2019 May 5.
4
StructureSelector: A web-based software to select and visualize the optimal number of clusters using multiple methods.
Mol Ecol Resour. 2018 Jan;18(1):176-177. doi: 10.1111/1755-0998.12719. Epub 2017 Oct 9.
5
Sampling schemes and drift can bias admixture proportions inferred by structure.
Mol Ecol Resour. 2020 Nov;20(6):1769-1785. doi: 10.1111/1755-0998.13234. Epub 2020 Aug 10.
6
The effect of close relatives on unsupervised Bayesian clustering algorithms in population genetic structure analysis.
Mol Ecol Resour. 2012 Sep;12(5):873-84. doi: 10.1111/j.1755-0998.2012.03156.x. Epub 2012 May 28.
7
Characterization of a Bayesian genetic clustering algorithm based on a Dirichlet process prior and comparison among Bayesian clustering methods.
BMC Bioinformatics. 2011 Jun 28;12:263. doi: 10.1186/1471-2105-12-263.
8
Individual identification from genetic marker data: developments and accuracy comparisons of methods.
Mol Ecol Resour. 2016 Jan;16(1):163-75. doi: 10.1111/1755-0998.12452. Epub 2015 Aug 20.
9
Identifying the number of population clusters with structure: problems and solutions.
Mol Ecol Resour. 2016 May;16(3):601-3. doi: 10.1111/1755-0998.12521.
10
A Continuous Correlated Beta Process Model for Genetic Ancestry in Admixed Populations.
PLoS One. 2016 Mar 11;11(3):e0151047. doi: 10.1371/journal.pone.0151047. eCollection 2016.

Cited by

1
Insights Into the Almond Domestication History.
Evol Appl. 2025 Aug 31;18(9):e70150. doi: 10.1111/eva.70150. eCollection 2025 Sep.
2
Diversity of Environmental Escherichia coli in Subtropical Freshwater Systems of South Africa.
Curr Microbiol. 2025 Jul 28;82(9):414. doi: 10.1007/s00284-025-04402-y.
3
Revealing the range of equally likely estimates in the admixture model.
G3 (Bethesda). 2025 Aug 6;15(8). doi: 10.1093/g3journal/jkaf142.
4
Genetic analysis and phytochemical profile of soursop (Annona muricata L.) cultivated in family orchards in southeastern Mexico.
PLoS One. 2025 May 7;20(5):e0321846. doi: 10.1371/journal.pone.0321846. eCollection 2025.
5
What Do We Gain When Tolerating Loss? The Information Bottleneck Wrings Out Recombination.
Mol Biol Evol. 2025 Mar 5;42(3). doi: 10.1093/molbev/msaf029.
6
Maintenance of Genetic Diversity Despite Population Fluctuations in the Lesser Prairie-Chicken ().
Ecol Evol. 2025 Jan 23;15(1):e70879. doi: 10.1002/ece3.70879. eCollection 2025 Jan.
7
Reduced Representation and Whole-Genome Sequencing Approaches Highlight Beluga Whale Populations Associated to Eastern Canada Summer Aggregations.
Evol Appl. 2024 Dec 18;17(12):e70058. doi: 10.1111/eva.70058. eCollection 2024 Dec.
8
Genomic patterns in the dwarf kingfishers of northern Melanesia reveal a mechanistic framework explaining the paradox of the great speciators.
Evol Lett. 2024 Jul 26;8(6):813-827. doi: 10.1093/evlett/qrae035. eCollection 2024 Dec.
9
Using eDNA to Supplement Population Genetic Analyses for Cryptic Marine Species: Identifying Population Boundaries for Alaska Harbour Porpoises.
Mol Ecol. 2025 Mar;34(5):e17563. doi: 10.1111/mec.17563. Epub 2024 Oct 25.
10
Comparative genomics reveal a novel phylotaxonomic order in the genus Fusobacterium.
Commun Biol. 2024 Sep 7;7(1):1102. doi: 10.1038/s42003-024-06825-y.