ADMIXPIPE: population analyses in ADMIXTURE for non-model organisms.

Affiliations

Department of Biological Sciences, University of Arkansas, Fayetteville, AR, 72701, USA.

Present address: Molecular Ecology Laboratory, Southwestern Native Aquatic Resources and Recovery Center (SNARRC), U.S. Fish & Wildlife Service, PO Box 219, Dexter, NM, 88230, USA.

Publication information

BMC Bioinformatics. 2020 Jul 29;21(1):337. doi: 10.1186/s12859-020-03701-4.

DOI:10.1186/s12859-020-03701-4
PMID:32727359
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7391514/
Abstract

BACKGROUND

Research on the molecular ecology of non-model organisms, while previously constrained, has now been greatly facilitated by the advent of reduced-representation sequencing protocols. However, tools that allow these large datasets to be efficiently parsed are often lacking, or if indeed available, then limited by the necessity of a comparable reference genome as an adjunct. This, of course, can be difficult when working with non-model organisms. Fortunately, pipelines are currently available that avoid this prerequisite, thus allowing data to be a priori parsed. An oft-used molecular ecology program (i.e., STRUCTURE), for example, is facilitated by such pipelines, yet they are surprisingly absent for a second program that is similarly popular and computationally more efficient (i.e., ADMIXTURE). The two programs differ in that ADMIXTURE employs a maximum-likelihood framework whereas STRUCTURE uses a Bayesian approach, yet both produce similar results. Given these issues, there is an overriding (and recognized) need among researchers in molecular ecology for bioinformatic software that will not only condense output from replicated ADMIXTURE runs, but also infer from these data the optimal number of population clusters (K).

RESULTS

Here we provide such a program (i.e., ADMIXPIPE) that (a) filters SNPs to allow the delineation of population structure in ADMIXTURE, then (b) parses the output for summarization and graphical representation via CLUMPAK. Our benchmarks effectively demonstrate how efficient the pipeline is for processing large, non-model datasets generated via double digest restriction-site associated DNA sequencing (ddRAD). Outputs not only parallel those from STRUCTURE, but also visualize the variation among individual ADMIXTURE runs, so as to facilitate selection of the most appropriate K-value.
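As an illustration of step (b), selecting K from replicated runs, the following minimal Python sketch collects the cross-validation (CV) error that ADMIXTURE reports when cross-validation is enabled and averages it per K. It is not ADMIXPIPE's own code: the function names and the example log text are hypothetical, and picking the K with the lowest mean CV error is only one common heuristic, assumed here for simplicity.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# ADMIXTURE prints lines such as "CV error (K=3): 0.52345" when run with --cv.
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def collect_cv_errors(log_texts):
    """Gather CV errors from the text of several ADMIXTURE logs, grouped by K."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    return dict(errors)


def summarize(errors):
    """Return {K: (mean CV error, standard deviation)} across replicates."""
    summary = {}
    for k in sorted(errors):
        values = errors[k]
        spread = stdev(values) if len(values) > 1 else 0.0
        summary[k] = (mean(values), spread)
    return summary


def best_k(summary):
    """Pick the K with the lowest mean CV error (a heuristic, not a rule)."""
    return min(summary, key=lambda k: summary[k][0])


# Hypothetical log excerpts standing in for real replicate output.
example_logs = [
    "CV error (K=1): 0.61\n",
    "CV error (K=2): 0.50\n",
    "CV error (K=2): 0.52\n",
    "CV error (K=3): 0.55\n",
]

if __name__ == "__main__":
    result = summarize(collect_cv_errors(example_logs))
    for k, (avg, sd) in result.items():
        print(f"K={k}: mean CV error {avg:.4f} (sd {sd:.4f})")
    print("Lowest mean CV error at K =", best_k(result))
```

Reporting the spread alongside the mean reflects the abstract's point that variation among individual runs should inform the choice of K, rather than a single run's value.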

CONCLUSIONS

ADMIXPIPE successfully integrates ADMIXTURE analysis with popular variant call format (VCF) filtering software to yield file types readily analyzed by CLUMPAK. Large population genomic datasets derived from non-model organisms are efficiently analyzed via the parallel-processing capabilities of ADMIXTURE. ADMIXPIPE is distributed under the GNU Public License and freely available for Mac OSX and Linux platforms at: https://github.com/stevemussmann/admixturePipeline .
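The replicated, multi-threaded runs mentioned above can be pictured with a short Python sketch that only builds the ADMIXTURE command lines and does not execute them. It is an assumption-laden illustration rather than ADMIXPIPE's implementation: the function name, the default values, and the file name `input.bed` are hypothetical, while `--cv`, `-j`, and `-s` are the ADMIXTURE options for cross-validation folds, thread count, and random seed.

```python
import shlex


def admixture_commands(bed_file, min_k, max_k, reps,
                       cv_folds=5, threads=4, base_seed=1000):
    """Build one ADMIXTURE command per (K, replicate) pair, each with its own seed."""
    commands = []
    for k in range(min_k, max_k + 1):
        for _ in range(reps):
            seed = base_seed + len(commands)  # unique seed for every run
            commands.append([
                "admixture",
                f"--cv={cv_folds}",   # cross-validation folds
                f"-j{threads}",       # number of threads
                "-s", str(seed),      # random seed
                bed_file,
                str(k),
            ])
    return commands


if __name__ == "__main__":
    # Hypothetical input: K from 1 to 3 with two replicates each.
    for cmd in admixture_commands("input.bed", 1, 3, 2):
        print(shlex.join(cmd))
```

Because ADMIXTURE names its output files after the input prefix and K, replicates of the same K would need separate working directories (or renaming) to avoid overwriting one another; how ADMIXPIPE handles this is not described in the abstract.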


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f5f/7391514/71f57860c6b6/12859_2020_3701_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9f5f/7391514/f2b8d3e95cbb/12859_2020_3701_Fig2_HTML.jpg
