AdDeam: a fast and scalable tool for estimating and clustering reference-level damage profiles.

Author information

Kraft Louis, Korneliussen Thorfinn Sand, Sackett Peter Wad, Renaud Gabriel

Affiliations

Department of Health Technology, Section for Bioinformatics, Technical University of Denmark, Kongens Lyngby, 2800, Denmark.

Lundbeck Foundation GeoGenetics Centre, Globe Institute, University of Copenhagen, Copenhagen K, 1350, Denmark.

Publication information

Bioinformatics. 2025 Aug 2;41(8). doi: 10.1093/bioinformatics/btaf407.

Abstract

MOTIVATION

DNA damage patterns, such as increased frequencies of C→T and G→A substitutions at fragment ends, are widely used in ancient DNA studies to assess authenticity and detect contamination. In metagenomic studies, fragments can be mapped against multiple references or de novo assembled contigs to identify those likely to be ancient. Generating and comparing damage profiles, however, can be both tedious and time-consuming. Although tools exist for estimating damage in single reference genomes and metagenomic datasets, none efficiently cluster damage patterns.
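To make the notion of a damage pattern concrete, the following minimal sketch counts how often a reference C is read as T at each of the first few positions from the 5' end of a fragment. It is an illustration only, not AdDeam's implementation: the function name `ct_frequency_5p`, the gap-free `(reference, read)` input format, and the toy alignments are all assumptions made for this example.

```python
def ct_frequency_5p(pairs, n_positions=5):
    """Per-position C->T frequency at the 5' end of aligned fragments.

    pairs: iterable of (reference, read) strings, gap-free and of equal
    length, both written in the read's 5'->3' orientation (an assumed,
    simplified input format; real tools parse BAM alignments).
    """
    ref_c = [0] * n_positions   # times the reference shows C at position i
    c_to_t = [0] * n_positions  # ...and the read shows T at that position
    for ref, read in pairs:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                ref_c[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    # Positions with no reference C get a frequency of 0.0
    return [c_to_t[i] / ref_c[i] if ref_c[i] else 0.0
            for i in range(n_positions)]


# Hypothetical toy alignments: two of three reference Cs at position 0
# are read as T, mimicking deamination at the fragment end.
toy = [
    ("CACGT", "TACGT"),
    ("CCAGT", "TCAGT"),
    ("CGCAT", "CGCAT"),
    ("ACCTG", "ACCTG"),
]
profile = ct_frequency_5p(toy, n_positions=3)
print(profile)
```

The mirror-image G→A frequency at the 3' end can be computed the same way by walking the fragment from its last position backwards.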

RESULTS

To address this methodological gap, we developed AdDeam, a tool that combines rapid damage estimation with clustering to streamline analyses and make potential contaminants or outliers easy to identify. AdDeam takes aligned ancient DNA (aDNA) fragments from multiple samples or contigs as input, computes their damage patterns, and clusters them. It outputs a representative damage profile per cluster, the probability that each sample belongs to a cluster, and a Principal Component Analysis of the per-sample damage patterns for fast visualisation. We evaluated AdDeam on both simulated and empirical datasets. AdDeam effectively distinguishes between different damage levels, such as those of uracil-DNA glycosylase-treated samples and the sample-specific damage of specimens from different time periods. It can also distinguish between contigs containing modern and ancient fragments, providing a clear framework for aDNA authentication and facilitating large-scale analyses.
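The workflow described above (damage profiles in, clusters with representative profiles and per-sample membership scores out) can be sketched in a few lines. The abstract does not say which clustering method AdDeam uses, so this sketch substitutes plain k-means with a softmax over distances as a stand-in; the resulting scores are heuristic, not calibrated probabilities. All function names, the `sharpness` parameter, and the toy profiles are hypothetical, and the PCA step is omitted.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two damage profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(rows):
    """Column-wise mean: the representative profile of a cluster."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(profiles, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        far = max(profiles,
                  key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in profiles]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            updated.append(mean_profile(members) if members else centroids[j])
        if updated == centroids:  # assignments have stabilised
            break
        centroids = updated
    return centroids, labels


def soft_membership(profile, centroids, sharpness=200.0):
    """Softmax over negative squared distances (heuristic score)."""
    scores = [-sharpness * sq_dist(profile, c) for c in centroids]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical C->T frequencies at the first three 5' positions.
samples = {
    "ancient_1": [0.30, 0.15, 0.08],
    "ancient_2": [0.28, 0.14, 0.07],
    "udg_1": [0.05, 0.01, 0.01],
    "modern_1": [0.01, 0.01, 0.01],
    "modern_2": [0.00, 0.01, 0.00],
}
names = list(samples)
profiles = [samples[n] for n in names]
centroids, labels = kmeans(profiles, k=2)
memberships = {n: soft_membership(samples[n], centroids) for n in names}

for name, label in zip(names, labels):
    print(name, label, [round(m, 3) for m in memberships[name]])
```

With two clusters, the toy data separate into a high-damage group and a low-damage group; a sample whose score is split between clusters would be a candidate for closer inspection.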

AVAILABILITY AND IMPLEMENTATION

AdDeam is publicly available at https://github.com/LouisPwr/AdDeam and can also be installed via Bioconda. It is implemented in Python and C++. All analysis scripts and datasets are available at https://github.com/LouisPwr/AdDeamAnalysis and on Zenodo under: 10.5281/zenodo.15052427.
