Inferring ancestry with the hierarchical soft clustering approach tangleGen.

Authors

Burger Klara Elisabeth, Klepper Solveig, von Luxburg Ulrike, Baumdicker Franz

Affiliations

Department of Computer Science, University of Tübingen, 72074 Tübingen, Germany.

Tübingen AI Center, 72076 Tübingen, Germany.

Publication information

Genome Res. 2024 Dec 23;34(12):2244-2255. doi: 10.1101/gr.279399.124.

DOI: 10.1101/gr.279399.124
PMID: 39433440
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11694745/

Abstract

Understanding the genetic ancestry of populations is central to numerous scientific and societal fields. It contributes to a better understanding of human evolutionary history, advances personalized medicine, aids in forensic identification, and allows individuals to connect to their genealogical roots. Existing methods, such as ADMIXTURE, have significantly improved our ability to infer ancestries. However, these methods typically work with a fixed number of independent ancestral populations. As a result, they provide insight into genetic admixture, but do not include a hierarchical interpretation. In particular, the intricate ancestral population structures remain difficult to unravel. Alternative methods with a consistent inheritance structure, such as hierarchical clustering, may offer benefits in terms of interpreting the inferred ancestries. Here, we present tangleGen, a soft clustering tool that transfers the hierarchical machine learning framework Tangles, which leverages graph theoretical concepts, to the field of population genetics. The hierarchical perspective of tangleGen on the composition and structure of populations improves the interpretability of the inferred ancestral relationships. Moreover, tangleGen adds a new layer of explainability, as it allows identifying the single-nucleotide polymorphisms that are responsible for the clustering structure. We demonstrate the capabilities and benefits of tangleGen for the inference of ancestral relationships, using both simulated data and data from the 1000 Genomes Project.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/0818f55c8c8c/2244f06.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/18f82c7879a0/2244f01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/f224e675f460/2244f02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/bdad338666f8/2244f03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/bc3da8e96a9d/2244f04.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/209d/11694745/e36ad418a64b/2244f05.jpg
