

Evaluation of genotype imputation using Glimpse tools on low coverage ancient DNA.

Affiliation

Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, 06100, Ankara, Turkey.

Publication information

Mamm Genome. 2024 Sep;35(3):461-473. doi: 10.1007/s00335-024-10053-4. Epub 2024 Jul 19.

Abstract

Ancient DNA provides a unique framework for directly studying human population genetics across time and space. However, because most ancient genomic data are low coverage, analyses are confronted with a low number of SNPs, genotype uncertainty, and reference bias. Here, we benchmark, for the first time, the two distinct versions of the Glimpse tools on 120 ancient human genomes from Eurasia, including genomes largely from previously under-evaluated regions, and compare the performance of genotype imputation with the de facto analysis approaches for low coverage genomic data. We further investigate the impact of two distinct reference panels on imputation accuracy for low coverage genomic data. We compute accuracy statistics and perform PCA and f-statistics to explore how the behaviour of genotype imputation on low coverage data depends on (i) the two versions of Glimpse, (ii) the two reference panels, (iii) four post-imputation filters and coverages, and (iv) the data type and geographical origin of the samples. Our results reveal that genotype imputation using Glimpse v2 is suitable even for ancient human genomes at 0.1X coverage. Additionally, using the 1000 Genomes panel merged with the Human Genome Diversity Panel as the reference improves imputation accuracy for rare variants with low minor allele frequency (MAF), which might be important not only for ancient genomics but also for modern human genomic studies based on low coverage data and for haplotype-based analysis. Most importantly, we reveal that genotype imputation of low coverage ancient human genomes reduces the genetic affinity of the samples towards the human reference genome. By addressing reference bias, one of the most challenging biases in data analysis, genotype imputation using Glimpse v2 is promising for low coverage ancient human genomic data analysis and for rare-variant-based and haplotype-based analysis.
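The abstract does not say which accuracy statistics were computed. As an illustration only, one common way to summarise imputation accuracy is the squared Pearson correlation (r²) between high-coverage "truth" genotypes and imputed dosages, reported per MAF bin so that rare variants can be inspected separately. The sketch below assumes that metric; the function names, bin edges, and toy records are hypothetical and are not taken from the paper or from the Glimpse tools.

```python
from collections import defaultdict


def pearson_r2(xs, ys):
    """Squared Pearson correlation; None if undefined (too few points or no variance)."""
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def maf_bin(maf, edges):
    """Label of the first bin whose upper edge is >= maf (edges must be ascending)."""
    lower = 0.0
    for upper in edges:
        if maf <= upper:
            return f"({lower:g}, {upper:g}]"
        lower = upper
    raise ValueError(f"MAF {maf} exceeds the last bin edge {edges[-1]}")


def accuracy_by_maf(records, edges=(0.01, 0.05, 0.5)):
    """Group (maf, true_genotype, imputed_dosage) records by MAF bin and return r2 per bin.

    true_genotype is the alternate-allele count (0, 1 or 2) from high-coverage data;
    imputed_dosage is the expected alternate-allele count (0.0 to 2.0) after imputation.
    The bin edges are illustrative assumptions, not values from the paper.
    """
    grouped = defaultdict(lambda: ([], []))
    for maf, truth, dosage in records:
        truths, dosages = grouped[maf_bin(maf, edges)]
        truths.append(truth)
        dosages.append(dosage)
    return {label: pearson_r2(t, d) for label, (t, d) in grouped.items()}


# Toy records (entirely made up): (reference-panel MAF, truth genotype, imputed dosage)
toy_records = [
    (0.005, 0, 0.0), (0.005, 1, 1.0), (0.005, 0, 0.0),  # rare variants
    (0.3, 0, 0.1), (0.3, 1, 1.2), (0.3, 2, 1.9),        # common variants
]
result = accuracy_by_maf(toy_records)
for label, r2 in sorted(result.items()):
    print(label, r2)
```

In practice the truth genotypes and dosages would be read from VCF files for the same samples and sites, and r² would be computed over many more records per bin than in this toy input.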

