A likelihood method for estimating present-day human contamination in ancient male samples using low-depth X-chromosome data.

Author information

Department of Computational Biology, University of Lausanne, 1015 Lausanne, Switzerland.

Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland.

Publication information

Bioinformatics. 2020 Feb 1;36(3):828-841. doi: 10.1093/bioinformatics/btz660.

Abstract

MOTIVATION

The presence of present-day human contaminating DNA fragments is one of the challenges defining ancient DNA (aDNA) research. This is especially relevant to the ancient human DNA field where it is difficult to distinguish endogenous molecules from human contaminants due to their genetic similarity. Recently, with the advent of high-throughput sequencing and new aDNA protocols, hundreds of ancient human genomes have become available. Contamination in those genomes has been measured with computational methods often developed specifically for these empirical studies. Consequently, some of these methods have not been implemented and tested for general use while few are aimed at low-depth nuclear data, a common feature in aDNA datasets.

RESULTS

We develop a new X-chromosome-based maximum likelihood method for estimating present-day human contamination in low-depth sequencing data from male individuals. We implement our method for general use, assess its performance under conditions typical of ancient human DNA research, and compare it to previous nuclear data-based methods through extensive simulations. For low-depth data, we show that existing methods can produce unusable estimates or substantially underestimate contamination. In contrast, our method provides accurate estimates for a depth of coverage as low as 0.5× on the X-chromosome when contamination is below 25%. Moreover, our method still yields meaningful estimates in very challenging situations, i.e. when the contaminant and the target come from closely related populations or when error rates are increased. With a running time below 5 min, our method is applicable to large-scale aDNA genomic studies.
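The intuition behind an X-chromosome-based estimate is that a male carries a single X copy, so reads that disagree with the endogenous allele at a polymorphic site must come from either sequencing error or contamination. The Python sketch below illustrates that idea with a deliberately simplified binomial model. It is not the authors' likelihood and not the contaminationX or ANGSD implementation: the function names, the input layout `(n_reads, n_mismatch, freq)`, the single fixed error rate `eps`, and the grid search are all assumptions made for illustration.

```python
import math


def log_likelihood(c, sites, eps):
    """Log-likelihood of contamination fraction c under a toy binomial model.

    sites: list of (n_reads, n_mismatch, freq) tuples, where n_mismatch counts
    reads that differ from the endogenous (consensus) allele and freq is the
    assumed frequency of the non-endogenous allele in the contaminating
    population. eps is an assumed per-read error rate with 0 < eps < 0.5.
    """
    total = 0.0
    for n_reads, n_mismatch, freq in sites:
        # Probability that the sequenced molecule truly carries the other allele:
        # it must be a contaminant (c) that carries that allele (freq).
        p_other = c * freq
        # An error can hide a true mismatch or create a false one.
        p = p_other * (1.0 - eps) + (1.0 - p_other) * eps
        total += n_mismatch * math.log(p) + (n_reads - n_mismatch) * math.log(1.0 - p)
    return total


def estimate_contamination(sites, eps, grid_size=1000):
    """Return the value of c on an evenly spaced grid in [0, 1] that maximizes the likelihood."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=lambda c: log_likelihood(c, sites, eps))


if __name__ == "__main__":
    # Hypothetical pooled counts: 1000 reads, 59 mismatches, allele frequency 0.5.
    example_sites = [(1000, 59, 0.5)]
    print(estimate_contamination(example_sites, eps=0.01))
```

A grid search keeps the sketch dependency-free. A real implementation would also need to estimate the error rate from the data, handle uncertainty in the endogenous allele at low depth, and report confidence intervals, none of which this toy model attempts.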

AVAILABILITY AND IMPLEMENTATION

The method is implemented in C++ and R and is available at github.com/sapfo/contaminationX and popgen.dk/angsd.
