A fast likelihood solution to the genetic clustering problem.

Authors

Beugin Marie-Pauline, Gayet Thibault, Pontier Dominique, Devillard Sébastien, Jombart Thibaut

Affiliations

Univ Lyon, Laboratoire de Biométrie et Biologie Evolutive, CNRS, Université Claude Bernard Lyon 1, Villeurbanne, France.

ANTAGENE, Animal Genomics Laboratory, La Tour de Salvagny, France.

Publication information

Methods Ecol Evol. 2018 Apr;9(4):1006-1016. doi: 10.1111/2041-210X.12968. Epub 2018 Jan 30.

DOI:10.1111/2041-210X.12968
PMID:29938015
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5993310/
Abstract

The investigation of genetic clusters in natural populations is a ubiquitous problem in a range of fields relying on the analysis of genetic data, such as molecular ecology, conservation biology and microbiology. Typically, genetic clusters are defined as distinct panmictic populations, or parental groups in the context of hybridisation. Two types of methods have been developed for identifying such clusters: model-based methods, which are usually computer-intensive but yield results that can be interpreted in the light of an explicit population genetic model, and geometric approaches, which are less interpretable but remarkably faster.

Here, we introduce a fast maximum-likelihood solution to the genetic clustering problem, which allies the advantages of both model-based and geometric approaches. Our method relies on maximising the likelihood of a fixed number of panmictic populations, combining a geometric approach with fast likelihood optimisation using the Expectation-Maximisation (EM) algorithm. It can be used for assigning genotypes to populations and, optionally, for identifying various types of hybrids between two parental populations. Several goodness-of-fit statistics can also be used to guide the choice of the retained number of clusters.

Using extensive simulations, we show that the method performs comparably to current gold standards for genetic clustering as well as hybrid detection, with some advantages for identifying hybrids after several backcrosses, while being orders of magnitude faster than other model-based methods. We also illustrate how it can be used for identifying the optimal number of clusters, and subsequently for assigning individuals to various hybrid classes simulated from an empirical microsatellite dataset.

The method is implemented in the package adegenet for the free software R, and is therefore easily integrated into existing pipelines for genetic data analysis. It can be applied to any kind of co-dominant markers, and can easily be extended to more complex models including, for instance, varying ploidy levels. Given its flexibility and computer-efficiency, it provides a useful complement to the existing toolbox for the study of genetic diversity in natural populations.
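
The abstract describes starting from an initial grouping and then maximising the likelihood of a fixed number of panmictic populations with EM. The following is a minimal Python sketch of that general idea, not the adegenet implementation. It assumes diploid co-dominant markers, Hardy-Weinberg proportions within each cluster, equal fixed mixing proportions, and a small pseudocount that keeps allele frequencies away from zero. The names `em_cluster`, `aic` and `toy_data`, the toy genotypes, and the AIC parameter count are all hypothetical choices made for illustration.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def estimate_freqs(genotypes, resp, k, n_alleles, pseudo):
    """M-step: allele frequencies per cluster and locus, weighted by membership."""
    freqs = []
    for c in range(k):
        per_locus = []
        for l in range(len(n_alleles)):
            counts = [pseudo] * n_alleles[l]  # assumed smoothing, avoids log(0)
            for geno, r in zip(genotypes, resp):
                a, b = geno[l]
                counts[a] += r[c]
                counts[b] += r[c]
            total = sum(counts)
            per_locus.append([x / total for x in counts])
        freqs.append(per_locus)
    return freqs


def genotype_loglik(geno, cluster_freqs):
    """Log-probability of a diploid multilocus genotype under Hardy-Weinberg."""
    ll = 0.0
    for (a, b), f in zip(geno, cluster_freqs):
        ll += math.log(f[a]) + math.log(f[b])
        if a != b:
            ll += math.log(2.0)  # two orderings for a heterozygote
    return ll


def em_cluster(genotypes, k, init_labels, n_alleles,
               max_iter=200, tol=1e-8, pseudo=0.01):
    """EM for k panmictic clusters, started from hard initial labels."""
    resp = [[1.0 if lab == c else 0.0 for c in range(k)] for lab in init_labels]
    loglik = -math.inf
    for _ in range(max_iter):
        freqs = estimate_freqs(genotypes, resp, k, n_alleles, pseudo)
        new_loglik = 0.0
        for i, geno in enumerate(genotypes):
            # E-step: membership probabilities with equal mixing proportions.
            logs = [genotype_loglik(geno, freqs[c]) - math.log(k) for c in range(k)]
            norm = log_sum_exp(logs)
            resp[i] = [math.exp(v - norm) for v in logs]
            new_loglik += norm
        converged = abs(new_loglik - loglik) < tol
        loglik = new_loglik
        if converged:
            break
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return labels, resp, loglik


def aic(loglik, k, n_alleles):
    """AIC, counting (alleles - 1) free frequencies per locus and cluster."""
    n_params = k * sum(a - 1 for a in n_alleles)
    return 2 * n_params - 2 * loglik


def toy_data(n_loci=10, n_per_pop=20):
    """Two well-separated toy populations at biallelic loci (made-up data)."""
    genotypes, truth = [], []
    for pop in (0, 1):
        for i in range(n_per_pop):
            genotypes.append([(0, 1) if (i + l) % 5 == 0 else (pop, pop)
                              for l in range(n_loci)])
            truth.append(pop)
    return genotypes, truth, [2] * n_loci


genos, truth, n_alleles = toy_data()
# Deliberately imperfect starting groups: every fourth individual is mislabelled.
init = [1 - t if i % 4 == 0 else t for i, t in enumerate(truth)]
labels, resp, ll2 = em_cluster(genos, 2, init, n_alleles)
_, _, ll1 = em_cluster(genos, 1, [0] * len(genos), n_alleles)
print("recovered true groups:", labels == truth)
print("AIC k=1:", round(aic(ll1, 1, n_alleles), 1),
      "AIC k=2:", round(aic(ll2, 2, n_alleles), 1))
```

In this sketch the starting labels stand in for the geometric step; any fast grouping (for example a hierarchical clustering of allele counts) could supply them. Comparing AIC across values of `k` is one simple way to illustrate how a goodness-of-fit statistic can guide the choice of the number of clusters.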


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/634a/5993310/b392bba98b0d/MEE3-9-1006-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/634a/5993310/345bd351da72/MEE3-9-1006-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/634a/5993310/b9beb2476d6e/MEE3-9-1006-g002.jpg

Similar articles

1
A fast likelihood solution to the genetic clustering problem.
Methods Ecol Evol. 2018 Apr;9(4):1006-1016. doi: 10.1111/2041-210X.12968. Epub 2018 Jan 30.
2
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3
An empirical comparison of population genetic analyses using microsatellite and SNP data for a species of conservation concern.
BMC Genomics. 2020 Jun 1;21(1):382. doi: 10.1186/s12864-020-06783-9.
4
Fast and accurate population admixture inference from genotype data from a few microsatellites to millions of SNPs.
Heredity (Edinb). 2022 Aug;129(2):79-92. doi: 10.1038/s41437-022-00535-z. Epub 2022 May 4.
5
A Bayesian approach to the identification of panmictic populations and the assignment of individuals.
Genet Res. 2001 Aug;78(1):59-77. doi: 10.1017/s001667230100502x.
6
Discriminant analysis of principal components: a new method for the analysis of genetically structured populations.
BMC Genet. 2010 Oct 15;11:94. doi: 10.1186/1471-2156-11-94.
7
[The use of the expectation-maximization (EM) algorithm for maximum likelihood estimation of gametic frequencies of multilocus polymorphic codominant systems based on sampled population data].
Genetika. 2002 Mar;38(3):407-18.
8
FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data.
BMC Bioinformatics. 2007 Jan 4;8:3. doi: 10.1186/1471-2105-8-3.
9
Resolving the structure of interactomes with hierarchical agglomerative clustering.
BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S44. doi: 10.1186/1471-2105-12-S1-S44.
10
Assessing population genetic structure via the maximisation of genetic distance.
Genet Sel Evol. 2009 Nov 9;41(1):49. doi: 10.1186/1297-9686-41-49.

Cited by

1
Depth-structured lineages in the coral Stylophora pistillata of the Northern Red Sea.
NPJ Biodivers. 2025 Apr 5;4(1):13. doi: 10.1038/s44185-025-00083-9.
2
Population genomics and connectivity of Vazella pourtalesii sponge grounds of the northwest Atlantic with conservation implications of deep sea vulnerable marine ecosystems.
Sci Rep. 2025 Jan 9;15(1):1540. doi: 10.1038/s41598-024-82462-z.
3
Exploring reported population differences in Norway lobster in the Pomo Pits region of the Adriatic Sea using genome-wide markers.
PeerJ. 2024 Oct 21;12:e17852. doi: 10.7717/peerj.17852. eCollection 2024.
4
How many species are there? Lineage diversification and hidden speciation in Solanaceae from highland grasslands in southern South America.
Ann Bot. 2024 Dec 31;134(7):1291-1305. doi: 10.1093/aob/mcae144.
5
Niche modelling and landscape genetics of the yellow-legged hornet: An integrative approach for evaluating central-marginal population dynamics in Europe.
Ecol Evol. 2024 Jul 24;14(7):e70029. doi: 10.1002/ece3.70029. eCollection 2024 Jul.
6
Ecological divergence despite common mating sites: Genotypes and symbiotypes shed light on cryptic diversity in the black bean aphid species complex.
Heredity (Edinb). 2024 Jun;132(6):320-330. doi: 10.1038/s41437-024-00687-0. Epub 2024 May 14.
7
Rare, long-distance dispersal underpins genetic connectivity in the pink sea fan.
Evol Appl. 2024 Mar 7;17(3):e13649. doi: 10.1111/eva.13649. eCollection 2024 Mar.
8
The global speciation continuum of the cyanobacterium Microcoleus.
Nat Commun. 2024 Mar 8;15(1):2122. doi: 10.1038/s41467-024-46459-6.
9
A reduced SNP panel optimised for non-invasive genetic assessment of a genetically impoverished conservation icon, the European bison.
Sci Rep. 2024 Jan 22;14(1):1875. doi: 10.1038/s41598-024-51495-9.
10
Testing the efficacy of different molecular tools for parasite conservation genetics: a case study using horsehair worms (Phylum: Nematomorpha).
Parasitology. 2023 Aug;150(9):842-851. doi: 10.1017/S0031182023000641. Epub 2023 Jul 7.

References

1
apex: phylogenetics with multiple genes.
Mol Ecol Resour. 2017 Jan;17(1):19-26. doi: 10.1111/1755-0998.12567. Epub 2016 Aug 12.
2
strataG: An R package for manipulating, summarizing and analysing population genetic data.
Mol Ecol Resour. 2017 Jan;17(1):5-11. doi: 10.1111/1755-0998.12559. Epub 2016 Jul 20.
3
fastSTRUCTURE: variational inference of population structure in large SNP data sets.
Genetics. 2014 Jun;197(2):573-89. doi: 10.1534/genetics.114.164350. Epub 2014 Apr 2.
4
Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction.
PeerJ. 2014 Mar 4;2:e281. doi: 10.7717/peerj.281. eCollection 2014.
5
The genetical structure of populations.
Ann Eugen. 1951 Mar;15(4):323-54. doi: 10.1111/j.1469-1809.1949.tb02451.x.
6
Hierarchical and spatially explicit clustering of DNA sequences with BAPS software.
Mol Biol Evol. 2013 May;30(5):1224-8. doi: 10.1093/molbev/mst028. Epub 2013 Feb 13.
7
ape 3.0: New tools for distance-based phylogenetics and evolutionary analysis in R.
Bioinformatics. 2012 Jun 1;28(11):1536-7. doi: 10.1093/bioinformatics/bts184. Epub 2012 Apr 11.
8
adegenet 1.3-1: new tools for the analysis of genome-wide SNP data.
Bioinformatics. 2011 Nov 1;27(21):3070-1. doi: 10.1093/bioinformatics/btr521. Epub 2011 Sep 16.
9
Inferring weak population structure with the assistance of sample group information.
Mol Ecol Resour. 2009 Sep;9(5):1322-32. doi: 10.1111/j.1755-0998.2009.02591.x. Epub 2009 Apr 1.
10
Population genetics meets behavioral ecology.
Trends Ecol Evol. 1996 Aug;11(8):338-42. doi: 10.1016/0169-5347(96)20050-3.