Comparison of algorithms to infer genetic population structure from unlinked molecular markers.

Author information

Peña-Malavera Andrea, Bruno Cecilia, Fernandez Elmer, Balzarini Monica

Publication information

Stat Appl Genet Mol Biol. 2014 Aug;13(4):391-402. doi: 10.1515/sagmb-2013-0006.

DOI:10.1515/sagmb-2013-0006
PMID:24964261
Abstract

Identifying population genetic structure (PGS) is crucial for breeding and conservation. Several clustering algorithms are available for identifying the underlying PGS in genetic data from maize genotypes. In this work, six methods for identifying PGS from unlinked molecular marker data were compared using simulated and experimental data consisting of multilocus biallelic genotypes. Datasets were defined under different biological scenarios characterized by three levels of genetic divergence among populations (low, medium, and high FST) and two numbers of sub-populations (K=3 and K=5). The relative performance of hierarchical and non-hierarchical clustering, model-based clustering (STRUCTURE), and neural-network clustering (SOM-RP-Q) was evaluated. The comparison criterion was the error rate in assigning genotypes to discrete sub-populations. In scenarios with a high level of divergence among genotype groups, all methods performed well. With a moderate level of genetic divergence (FST=0.2), SOM-RP-Q and STRUCTURE performed better than hierarchical and non-hierarchical clustering. In all simulated scenarios with low genetic divergence and in the experimental SNP maize panel (largely unlinked), SOM-RP-Q achieved the lowest clustering error rate. The SOM algorithm used here is more effective than the other evaluated methods for sparse unlinked genetic data.
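
The abstract names the clustering error rate as the comparison criterion but does not say how it was computed. The sketch below shows one common way to obtain such a rate, under the assumption that predicted cluster ids are arbitrary and must first be matched to the true sub-populations. The function name `clustering_error_rate`, the brute-force matching over label permutations, and the toy labels are all illustrative assumptions, not the authors' procedure.

```python
from itertools import permutations

_UNMATCHED = object()  # placeholder for a predicted cluster with no true group


def clustering_error_rate(true_labels, predicted_labels):
    """Return the fraction of genotypes placed in the wrong sub-population.

    Cluster ids are arbitrary, so the error is minimized over every
    one-to-one mapping from predicted clusters to true sub-populations.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must not be empty")

    # Distinct ids in order of first appearance (no sorting required).
    true_ids = list(dict.fromkeys(true_labels))
    pred_ids = list(dict.fromkeys(predicted_labels))

    # Contingency counts: genotypes in each (predicted, true) pair.
    counts = {}
    for t, p in zip(true_labels, predicted_labels):
        counts[(p, t)] = counts.get((p, t), 0) + 1

    # If there are more predicted clusters than true groups, the extras map
    # to a placeholder and all of their genotypes count as errors.
    targets = true_ids + [_UNMATCHED] * max(0, len(pred_ids) - len(true_ids))

    best_correct = 0
    for mapping in permutations(targets, len(pred_ids)):
        correct = sum(counts.get((p, t), 0) for p, t in zip(pred_ids, mapping))
        best_correct = max(best_correct, correct)

    return 1.0 - best_correct / len(true_labels)


# Hypothetical toy data: 9 genotypes, K=3 true sub-populations.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
clusters = [2, 2, 2, 0, 0, 1, 1, 1, 1]  # one genotype from group 1 is misplaced
error = clustering_error_rate(truth, clusters)
print(f"clustering error rate: {error:.3f}")
```

Trying every permutation is affordable for the K=3 and K=5 settings mentioned in the abstract (at most 120 mappings). For larger K, an assignment solver would be the more practical choice.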

Similar articles

1
A comparison of biallelic markers and microsatellites for the estimation of population and conservation genetic parameters in Atlantic salmon (Salmo salar).
J Hered. 2007 Nov-Dec;98(7):692-704. doi: 10.1093/jhered/esm093. Epub 2007 Nov 5.
2
AMOVA-based clustering of population genetic data.
J Hered. 2012 Sep-Oct;103(5):744-50. doi: 10.1093/jhered/ess047. Epub 2012 Aug 15.
3
Evaluation and comparison of gene clustering methods in microarray analysis.
Bioinformatics. 2006 Oct 1;22(19):2405-12. doi: 10.1093/bioinformatics/btl406. Epub 2006 Jul 31.
4
Population identification using genetic data.
Annu Rev Genomics Hum Genet. 2012;13:337-61. doi: 10.1146/annurev-genom-082410-101510. Epub 2012 Jun 11.
5
Imputation of missing single nucleotide polymorphism genotypes using a multivariate mixed model framework.
J Anim Sci. 2011 Jul;89(7):2042-9. doi: 10.2527/jas.2010-3297. Epub 2011 Feb 25.
6
The effect of close relatives on unsupervised Bayesian clustering algorithms in population genetic structure analysis.
Mol Ecol Resour. 2012 Sep;12(5):873-84. doi: 10.1111/j.1755-0998.2012.03156.x. Epub 2012 May 28.
7
Clustering of gene expression data: performance and similarity analysis.
BMC Bioinformatics. 2006 Dec 12;7 Suppl 4(Suppl 4):S19. doi: 10.1186/1471-2105-7-S4-S19.
8
A Bayesian clustering approach for detecting gene-gene interactions in high-dimensional genotype data.
Stat Appl Genet Mol Biol. 2014 Jun;13(3):275-97. doi: 10.1515/sagmb-2012-0074.
9
Microarray data clustering based on temporal variation: FCV with TSD preclustering.
Appl Bioinformatics. 2003;2(1):35-45.

Cited by

1
Sexual Reproductive Strategies as Resolved through Computational Methods Designed for Aneuploid Genomes.
Genes (Basel). 2021 Jan 26;12(2):167. doi: 10.3390/genes12020167.
2
A Deep Learning Approach to Population Structure Inference in Inbred Lines of Maize.
Front Genet. 2020 Nov 24;11:543459. doi: 10.3389/fgene.2020.543459. eCollection 2020.
3
A transcriptomic study for identifying cardia- and non-cardia-specific gastric cancer prognostic factors using genetic algorithm-based methods.
J Cell Mol Med. 2020 Aug;24(16):9457-9465. doi: 10.1111/jcmm.15618. Epub 2020 Jul 10.
4
The superior fault tolerance of artificial neural network training with a fault/noise injection-based genetic algorithm.
Protein Cell. 2016 Oct;7(10):735-748. doi: 10.1007/s13238-016-0302-5. Epub 2016 Aug 9.
5
Genetic analysis of Indian tasar silkmoth (Antheraea mylitta) populations.
Sci Rep. 2015 Oct 29;5:15728. doi: 10.1038/srep15728.