Joint Estimation of Contamination, Error and Demography for Nuclear DNA from Ancient Humans

Authors

Racimo Fernando, Renaud Gabriel, Slatkin Montgomery

Affiliations

Department of Integrative Biology, University of California, Berkeley, Berkeley, California, United States of America.

Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.

Publication information

PLoS Genet. 2016 Apr 6;12(4):e1005972. doi: 10.1371/journal.pgen.1005972. eCollection 2016 Apr.

DOI:10.1371/journal.pgen.1005972

PMID:27049965

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4822957/

Abstract

When sequencing an ancient DNA sample from a hominin fossil, DNA from present-day humans involved in excavation and extraction will be sequenced along with the endogenous material. This type of contamination is problematic for downstream analyses as it will introduce a bias towards the population of the contaminating individual(s). Quantifying the extent of contamination is a crucial step as it allows researchers to account for possible biases that may arise in downstream genetic analyses. Here, we present an MCMC algorithm to co-estimate the contamination rate, sequencing error rate and demographic parameters (including drift times and admixture rates) for an ancient nuclear genome obtained from human remains, when the putative contaminating DNA comes from present-day humans. We assume we have a large panel representing the putative contaminant population (e.g. European, East Asian or African). The method is implemented in a C++ program called 'Demographic Inference with Contamination and Error' (DICE). We applied it to simulations and genome data from ancient Neanderthals and modern humans. With reasonable levels of genome sequence coverage (>3X), we find we can recover accurate estimates of all these parameters, even when the contamination rate is as high as 50%.
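The idea of co-estimating contamination and error by MCMC can be illustrated with a small toy model. The Python sketch below is not DICE's code and does not reproduce its likelihood. It makes three assumptions that are not in the abstract: each read is drawn from the contaminant panel with probability `c` (derived-allele frequency `w` at that site), otherwise from the endogenous genotype; sequencing error flips the observed allele with probability `e`; and the endogenous genotype prior `geno_prior` is fixed by hand, whereas the abstract says DICE estimates demographic parameters (drift times, admixture rates) alongside contamination and error. All function and parameter names are hypothetical.

```python
import math
import random


def site_log_lik(d, n, w, c, e, geno_prior):
    """Log-likelihood (up to a constant) of d derived reads out of n at one site."""
    total = 0.0
    for g, prior in enumerate(geno_prior):  # g = 0, 1, 2 derived copies
        q = c * w + (1.0 - c) * g / 2.0     # read truly carries the derived allele
        p = q * (1.0 - e) + (1.0 - q) * e   # after symmetric sequencing error
        total += prior * p ** d * (1.0 - p) ** (n - d)
    return math.log(total)


def log_lik(sites, c, e, geno_prior):
    """Sum of per-site log-likelihoods; sites are (d, n, w) tuples."""
    return sum(site_log_lik(d, n, w, c, e, geno_prior) for d, n, w in sites)


def simulate(num_sites, c, e, geno_prior, coverage, rng):
    """Simulate read counts under the toy model (assumed, not from the paper)."""
    sites = []
    for _ in range(num_sites):
        w = rng.random()  # contaminant panel frequency at this site
        g = rng.choices([0, 1, 2], weights=geno_prior)[0]
        q = c * w + (1.0 - c) * g / 2.0
        p = q * (1.0 - e) + (1.0 - q) * e
        d = sum(rng.random() < p for _ in range(coverage))
        sites.append((d, coverage, w))
    return sites


def metropolis(sites, geno_prior, steps, rng, step_size=0.02):
    """Random-walk Metropolis over (c, e) with flat priors on [0, 1] x (0, 0.5)."""
    c, e = 0.5, 0.05
    cur = log_lik(sites, c, e, geno_prior)
    samples = []
    for _ in range(steps):
        c_new = c + rng.gauss(0.0, step_size)
        e_new = e + rng.gauss(0.0, step_size / 4.0)
        # Proposals outside the prior support are rejected outright.
        if 0.0 <= c_new <= 1.0 and 0.0 < e_new < 0.5:
            new = log_lik(sites, c_new, e_new, geno_prior)
            if rng.random() < math.exp(min(0.0, new - cur)):
                c, e, cur = c_new, e_new, new
        samples.append((c, e))
    return samples


def main():
    rng = random.Random(1)
    prior = (0.6, 0.2, 0.2)  # hand-picked genotype prior, an assumption
    sites = simulate(400, 0.25, 0.02, prior, 8, rng)
    kept = metropolis(sites, prior, 1500, rng)[500:]  # drop burn-in
    print("posterior mean c:", sum(s[0] for s in kept) / len(kept))
    print("posterior mean e:", sum(s[1] for s in kept) / len(kept))


if __name__ == "__main__":
    main()
```

In this toy setting the two parameters are separable because the contaminant frequency `w` varies across sites: the error rate sets the baseline of mismatching reads, while the contamination rate sets how strongly read counts track the panel frequency.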
