Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data.

Authors

Soraggi Samuele, Wiuf Carsten, Albrechtsen Anders

Affiliations

Department of Mathematical Sciences, Faculty of Science, University of Copenhagen, 2100, Denmark.

Publication information

G3 (Bethesda). 2018 Feb 2;8(2):551-566. doi: 10.1534/g3.117.300192.

DOI:10.1534/g3.117.300192
PMID:29196497
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5919751/
Abstract

The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1-10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates.
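
To make the quantity discussed in the abstract concrete, here is a minimal sketch of the conventional ABBA-BABA form of the D-statistic computed from per-site derived-allele frequencies, with a delete-one-block jackknife for the standard error and Z-score. It is not the authors' read-based extension (no use of all reads, no error correction, no introgression correction). The function names, the input layout, and the unweighted jackknife with roughly equal blocks are assumptions made for illustration only.

```python
import math


def d_statistic(p1, p2, p3, p4):
    """Classical ABBA-BABA D from per-site derived-allele frequencies.

    p1, p2, p3 are the ingroup populations H1, H2, H3; p4 is the outgroup H4.
    (Hypothetical helper, named for this illustration.)
    """
    num = den = 0.0
    for a, b, c, o in zip(p1, p2, p3, p4):
        abba = (1 - a) * b * c * (1 - o)  # H2 and H3 share the derived allele
        baba = a * (1 - b) * c * (1 - o)  # H1 and H3 share the derived allele
        num += abba - baba
        den += abba + baba
    return num / den if den > 0 else float("nan")


def block_jackknife(p1, p2, p3, p4, n_blocks):
    """Delete-one-block jackknife; returns (D, standard error, Z).

    Assumes contiguous, roughly equal-sized blocks (unweighted formula).
    """
    pops = [list(p) for p in (p1, p2, p3, p4)]
    n = len(pops[0])
    bounds = [k * n // n_blocks for k in range(n_blocks + 1)]
    d_full = d_statistic(*pops)
    leave_one_out = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        # Recompute D with the sites of one block removed
        leave_one_out.append(d_statistic(*(p[:lo] + p[hi:] for p in pops)))
    mean = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((d - mean) ** 2 for d in leave_one_out)
    se = math.sqrt(var)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


# Toy data: three ABBA sites followed by one BABA site (frequencies are 0 or 1)
p1 = [0, 0, 0, 1]
p2 = [1, 1, 1, 0]
p3 = [1, 1, 1, 1]
p4 = [0, 0, 0, 0]
d, se, z = block_jackknife(p1, p2, p3, p4, n_blocks=2)
print(d, se, z)
```

On the toy data, three ABBA sites against one BABA site give D = (3 - 1) / (3 + 1) = 0.5. In practice the blocks would be large genomic windows, so that the block estimates are approximately independent and Z can be compared against a standard normal distribution, as the abstract describes.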

Similar articles

1. Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data.
   G3 (Bethesda). 2018 Feb 2;8(2):551-566. doi: 10.1534/g3.117.300192.
2. Estimating individual admixture proportions from next generation sequencing data.
   Genetics. 2013 Nov;195(3):693-702. doi: 10.1534/genetics.113.154138. Epub 2013 Sep 11.
3. Testing for ancient admixture between closely related populations.
   Mol Biol Evol. 2011 Aug;28(8):2239-52. doi: 10.1093/molbev/msr048. Epub 2011 Feb 15.
4. Comparative Performance of Popular Methods for Hybrid Detection using Genomic Data.
   Syst Biol. 2021 Aug 11;70(5):891-907. doi: 10.1093/sysbio/syaa092.
5. Evaluating the use of ABBA-BABA statistics to locate introgressed loci.
   Mol Biol Evol. 2015 Jan;32(1):244-57. doi: 10.1093/molbev/msu269. Epub 2014 Sep 22.
6. SNP detection for massively parallel whole-genome resequencing.
   Genome Res. 2009 Jun;19(6):1124-32. doi: 10.1101/gr.088013.108. Epub 2009 May 6.
7. Manifold learning for human population structure studies.
   PLoS One. 2012;7(1):e29901. doi: 10.1371/journal.pone.0029901. Epub 2012 Jan 17.
8. Optimal sequencing depth design for whole genome re-sequencing in pigs.
   BMC Bioinformatics. 2019 Nov 8;20(1):556. doi: 10.1186/s12859-019-3164-z.
9. Calculation of Tajima's D and other neutrality test statistics from low depth next-generation sequencing data.
   BMC Bioinformatics. 2013 Oct 2;14:289. doi: 10.1186/1471-2105-14-289.
10. Tracking human population structure through time from whole genome sequences.
    PLoS Genet. 2020 Mar 9;16(3):e1008552. doi: 10.1371/journal.pgen.1008552. eCollection 2020 Mar.

Cited by

1. Kazakh Tobet dogs in the genomic landscape: refining the history of livestock guardian breeds.
   BMC Biol. 2025 Aug 5;23(1):240. doi: 10.1186/s12915-025-02344-2.
2. The Nariño Cat, the Tigrinas and Their Problematic Systematics and Phylogeography: The Real Story.
   Animals (Basel). 2025 Jun 26;15(13):1891. doi: 10.3390/ani15131891.
3. vcfgl: a flexible genotype likelihood simulator for VCF/BCF files.
   Bioinformatics. 2025 Mar 29;41(4). doi: 10.1093/bioinformatics/btaf098.
4. The genomic natural history of the aurochs.
   Nature. 2024 Nov;635(8037):136-141. doi: 10.1038/s41586-024-08112-6. Epub 2024 Oct 30.
5. The phylogenetic relationship and demographic history of rhesus macaques (Macaca mulatta) in subtropical and temperate regions, China.
   Ecol Evol. 2024 May 20;14(5):e11429. doi: 10.1002/ece3.11429. eCollection 2024 May.
6. Using genome-wide data to ascertain taxonomic status and assess population genetic structure for Houston toads (Bufo [= Anaxyrus] houstonensis).
   Sci Rep. 2024 Feb 8;14(1):3306. doi: 10.1038/s41598-024-53705-w.
7. Geographic Variation in Genomic Signals of Admixture Between Two Closely Related European Sepsid Fly Species.
   Evol Biol. 2023;50(4):395-412. doi: 10.1007/s11692-023-09612-5. Epub 2023 Aug 25.
8. Iron age genomic data from Althiburos - Tunisia renew the debate on the origins of African taurine cattle.
   iScience. 2023 Jun 24;26(7):107196. doi: 10.1016/j.isci.2023.107196. eCollection 2023 Jul 21.
9. Past Connectivity but Recent Inbreeding in Cross River Gorillas Determined Using Whole Genomes from Single Hairs.
   Genes (Basel). 2023 Mar 18;14(3):743. doi: 10.3390/genes14030743.
10. Historic samples reveal loss of wild genotype through domestic chicken introgression during the Anthropocene.
    PLoS Genet. 2023 Jan 19;19(1):e1010551. doi: 10.1371/journal.pgen.1010551. eCollection 2023 Jan.

References

1. POPULATION GENETICS. Genomic evidence for the Pleistocene and recent population history of Native Americans.
   Science. 2015 Aug 21;349(6250):aab3884. doi: 10.1126/science.aab3884. Epub 2015 Jul 21.
2. Genetic evidence for two founding populations of the Americas.
   Nature. 2015 Sep 3;525(7567):104-8. doi: 10.1038/nature14895. Epub 2015 Jul 21.
3. The genetic prehistory of the New World Arctic.
   Science. 2014 Aug 29;345(6200):1255832. doi: 10.1126/science.1255832.
4. The genome of a Late Pleistocene human from a Clovis burial site in western Montana.
   Nature. 2014 Feb 13;506(7487):225-9. doi: 10.1038/nature13025.
5. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans.
   Nature. 2014 Jan 2;505(7481):87-91. doi: 10.1038/nature12736. Epub 2013 Nov 20.
6. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse.
   Nature. 2013 Jul 4;499(7456):74-8. doi: 10.1038/nature12323. Epub 2013 Jun 26.
7. Higher levels of neanderthal ancestry in East Asians than in Europeans.
   Genetics. 2013 May;194(1):199-209. doi: 10.1534/genetics.112.148213. Epub 2013 Feb 14.
8. Inference of population splits and mixtures from genome-wide allele frequency data.
   PLoS Genet. 2012;8(11):e1002967. doi: 10.1371/journal.pgen.1002967. Epub 2012 Nov 15.
9. Ancient admixture in human history.
   Genetics. 2012 Nov;192(3):1065-93. doi: 10.1534/genetics.112.145037. Epub 2012 Sep 7.
10. A high-coverage genome sequence from an archaic Denisovan individual.
    Science. 2012 Oct 12;338(6104):222-6. doi: 10.1126/science.1224344. Epub 2012 Aug 30.