Clumpak: a program for identifying clustering modes and packaging population structure inferences across K.

Authors

Kopelman Naama M, Mayzel Jonathan, Jakobsson Mattias, Rosenberg Noah A, Mayrose Itay

Affiliations

Department of Molecular Biology and Ecology of Plants, Tel Aviv University, Ramat Aviv, 69978, Israel.

Department of Evolutionary Biology and SciLife Lab, Uppsala University, Uppsala, 75236, Sweden.

Publication information

Mol Ecol Resour. 2015 Sep;15(5):1179-91. doi: 10.1111/1755-0998.12387. Epub 2015 Feb 27.

DOI: 10.1111/1755-0998.12387
PMID: 25684545
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4534335/

Abstract

The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modelling assumptions, compares results across different predetermined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the postprocessing of results of model-based population structure analyses. For analysing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology.
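
The mode-detection step described in the abstract can be illustrated with a minimal sketch of Markov clustering applied to a run-by-run similarity matrix. This is not Clumpak's code: the function name `mcl_modes`, the similarity cutoff of 0.9, the inflation value of 2.0 and the toy matrix are all assumptions chosen for illustration, and in practice the similarities would come from Clumpp rather than being typed in by hand.

```python
def _normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Plain square-matrix product, kept dependency-free for the sketch."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl_modes(similarity, threshold=0.9, inflation=2.0, max_iter=100, tol=1e-9):
    """Group replicate runs into modes with a basic Markov clustering loop.

    `threshold` and `inflation` are illustrative values, not Clumpak defaults.
    """
    n = len(similarity)
    # Drop weak edges and add self-loops so that every column has mass.
    m = [[1.0 if i == j
          else (similarity[i][j] if similarity[i][j] >= threshold else 0.0)
          for j in range(n)] for i in range(n)]
    m = _normalize_columns(m)

    for _ in range(max_iter):
        expanded = _matmul(m, m)  # expansion: flow spreads along paths
        inflated = _normalize_columns(  # inflation: strong flow is boosted
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break

    # Runs that still exchange flow after convergence belong to the same mode.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(n):
            if m[i][j] > 1e-6:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Largest mode first, mirroring the idea of a "major" mode.
    return sorted(groups.values(), key=lambda g: (-len(g), g))


# Hypothetical pairwise similarities between five replicate runs at one K.
similarity = [
    [1.00, 0.98, 0.97, 0.60, 0.62],
    [0.98, 1.00, 0.99, 0.61, 0.60],
    [0.97, 0.99, 1.00, 0.59, 0.61],
    [0.60, 0.61, 0.59, 1.00, 0.98],
    [0.62, 0.60, 0.61, 0.98, 1.00],
]

print(mcl_modes(similarity))
```

In this toy input, runs 0 to 2 resemble one another closely and runs 3 and 4 form a second group, so the sketch separates them into two modes. A consensus solution for each mode would then be computed from the runs in that group, a step the abstract attributes to Clumpp and which is not shown here.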

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4395/4534335/befb95440397/nihms-664791-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4395/4534335/dba1075be4ea/nihms-664791-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4395/4534335/77397f5b6818/nihms-664791-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4395/4534335/1169c4536827/nihms-664791-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4395/4534335/6fddba5cc44c/nihms-664791-f0005.jpg

Similar articles

1
Clumpak: a program for identifying clustering modes and packaging population structure inferences across K.
Mol Ecol Resour. 2015 Sep;15(5):1179-91. doi: 10.1111/1755-0998.12387. Epub 2015 Feb 27.
2
CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure.
Bioinformatics. 2007 Jul 15;23(14):1801-6. doi: 10.1093/bioinformatics/btm233. Epub 2007 May 7.
3
Clumppling: cluster matching and permutation program with integer linear programming.
Bioinformatics. 2024 Jan 2;40(1). doi: 10.1093/bioinformatics/btad751.
4
StructureSelector: A web-based software to select and visualize the optimal number of clusters using multiple methods.
Mol Ecol Resour. 2018 Jan;18(1):176-177. doi: 10.1111/1755-0998.12719. Epub 2017 Oct 9.
5
Identifying the number of population clusters with structure: problems and solutions.
Mol Ecol Resour. 2016 May;16(3):601-3. doi: 10.1111/1755-0998.12521.
6
ADMIXPIPE: population analyses in ADMIXTURE for non-model organisms.
BMC Bioinformatics. 2020 Jul 29;21(1):337. doi: 10.1186/s12859-020-03701-4.
7
The computer program structure for assigning individuals to populations: easy to use but easier to misuse.
Mol Ecol Resour. 2017 Sep;17(5):981-990. doi: 10.1111/1755-0998.12650. Epub 2017 Feb 7.
8
Individual identification from genetic marker data: developments and accuracy comparisons of methods.
Mol Ecol Resour. 2016 Jan;16(1):163-75. doi: 10.1111/1755-0998.12452. Epub 2015 Aug 20.
9
WebStruct and VisualStruct: Web interfaces and visualization for Structure software implemented in a cluster environment.
J Integr Bioinform. 2008 Sep 24;5(1):89. doi: 10.2390/biecoll-jib-2008-89.
10
Structure_threader: An improved method for automation and parallelization of programs structure, fastStructure and MavericK on multicore CPU systems.
Mol Ecol Resour. 2017 Nov;17(6):e268-e274. doi: 10.1111/1755-0998.12702. Epub 2017 Sep 16.

Cited by

1
Uncovering the genetic landscape of soybean accessions from Kazakhstan in comparison with global germplasm using whole genome resequencing.
BMC Genomics. 2025 Sep 3;26(1):802. doi: 10.1186/s12864-025-12024-8.
2
Genetic Consequences of Tree Planting Versus Natural Colonisation: Implications for Afforestation Programmes in the United Kingdom.
Evol Appl. 2025 Aug 27;18(8):e70146. doi: 10.1111/eva.70146. eCollection 2025 Aug.
3
Population Structure and Genetic Diversity Among Shagya Arabian Horse Genealogical Lineages in Bulgaria Based on Microsatellite Genotyping.
Vet Sci. 2025 Aug 19;12(8):776. doi: 10.3390/vetsci12080776.
4
Genetic Diversity and Population Structure of Nine Local Sheep Populations Bred in the Carpathia Area of Central Europe Revealed by Microsatellite Analysis.
Animals (Basel). 2025 Aug 15;15(16):2400. doi: 10.3390/ani15162400.
5
Consequences of interspecific plant hybridization on metabolic diversity in naturally occurring hybrid swarms.
Plant J. 2025 Aug;123(4):e70444. doi: 10.1111/tpj.70444.
6
Genomic analysis of differentiation and demography of the formerly conspecific agile (Dipodomys agilis) and Dulzura (D. simulans) kangaroo rats.
Heredity (Edinb). 2025 Aug 25. doi: 10.1038/s41437-025-00789-3.
7
Genome sequence analysis provides evidence that a boreal crustacean colonised Svalbard well before the ongoing Atlantification of the Arctic.
Heredity (Edinb). 2025 Aug 23. doi: 10.1038/s41437-025-00793-7.
8
Chromosomal Inversion Associated With Diet Differences in Common Quails Sharing Wintering Grounds.
Ecol Evol. 2025 Aug 20;15(8):e71792. doi: 10.1002/ece3.71792. eCollection 2025 Aug.
9
Assessment of a microhaplotype panel for human identification and ancestry inference in Brazil.
Int J Legal Med. 2025 Aug 22. doi: 10.1007/s00414-025-03573-4.
10
A defined microbial community reproduces attributes of fine flavour chocolate fermentation.
Nat Microbiol. 2025 Aug 18. doi: 10.1038/s41564-025-02077-6.

References

1
Fast and efficient estimation of individual ancestry coefficients.
Genetics. 2014 Apr;196(4):973-83. doi: 10.1534/genetics.113.160572. Epub 2014 Feb 4.
2
Analyses of genetic ancestry enable key insights for molecular ecology.
Mol Ecol. 2013 Nov;22(21):5278-94. doi: 10.1111/mec.12488. Epub 2013 Sep 19.
3
A DNA-based registry for all animal species: the barcode index number (BIN) system.
PLoS One. 2013 Jul 8;8(7):e66213. doi: 10.1371/journal.pone.0066213. Print 2013.
4
Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation.
Nat Genet. 2013 Jan;45(1):67-71. doi: 10.1038/ng.2494. Epub 2012 Dec 16.
5
Recommendations for utilizing and reporting population genetic analyses: the reproducibility of genetic clustering using the program STRUCTURE.
Mol Ecol. 2012 Oct;21(20):4925-30. doi: 10.1111/j.1365-294X.2012.05754.x. Epub 2012 Sep 24.
6
Microbial co-occurrence relationships in the human microbiome.
PLoS Comput Biol. 2012;8(7):e1002606. doi: 10.1371/journal.pcbi.1002606. Epub 2012 Jul 12.
7
Structurama: bayesian inference of population structure.
Evol Bioinform Online. 2011;7:55-9. doi: 10.4137/EBO.S6761. Epub 2011 Jun 2.
8
Enhancements to the ADMIXTURE algorithm for individual ancestry estimation.
BMC Bioinformatics. 2011 Jun 18;12:246. doi: 10.1186/1471-2105-12-246.
9
Inferring weak population structure with the assistance of sample group information.
Mol Ecol Resour. 2009 Sep;9(5):1322-32. doi: 10.1111/j.1755-0998.2009.02591.x. Epub 2009 Apr 1.
10
Genomic microsatellites identify shared Jewish ancestry intermediate between Middle Eastern and European populations.
BMC Genet. 2009 Dec 8;10:80. doi: 10.1186/1471-2156-10-80.