Testing of Alignment Parameters for Ancient Samples: Evaluating and Optimizing Mapping Parameters for Ancient Samples Using the TAPAS Tool.

Authors

Taron Ulrike H, Lell Moritz, Barlow Axel, Paijmans Johanna L A

Affiliation

Institute for Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476 Potsdam, Germany.

Publication information

Genes (Basel). 2018 Mar 13;9(3):157. doi: 10.3390/genes9030157.

Abstract

High-throughput sequence data retrieved from ancient or other degraded samples has led to unprecedented insights into the evolutionary history of many species, but the analysis of such sequences also poses specific computational challenges. The most commonly used approach involves mapping sequence reads to a reference genome. However, this process becomes increasingly challenging with an elevated genetic distance between target and reference or with the presence of contaminant sequences with high sequence similarity to the target species. The evaluation and testing of mapping efficiency and stringency are thus paramount for the reliable identification and analysis of ancient sequences. In this paper, we present 'TAPAS' (Testing of Alignment Parameters for Ancient Samples), a computational tool that enables the systematic testing of mapping tools for ancient data by simulating sequence data reflecting the properties of an ancient dataset and performing test runs using the mapping software and parameter settings of interest. We showcase TAPAS by using it to assess and improve the mapping strategy for a degraded sample from a banded linsang (Prionodon linsang), for which no closely related reference is currently available. This enables a 1.8-fold increase in the number of mapped reads without sacrificing mapping specificity. The increase in mapped reads effectively reduces the need for additional sequencing, thus making more economical use of time, resources, and sample material.
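
As an illustration of what "mapping efficiency and stringency" can mean on simulated data, the following is a minimal sketch and not the TAPAS implementation. It assumes each simulated read carries a known origin (endogenous target or contaminant) and a flag recording whether the mapper aligned it; the names `SimulatedRead`, `score_mapping`, and `fold_change` are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class SimulatedRead:
    """One simulated read with a known origin (hypothetical record type)."""
    name: str
    is_endogenous: bool  # True: simulated from the target; False: contaminant
    mapped: bool         # True: the mapper aligned it to the reference


def score_mapping(reads: Iterable[SimulatedRead]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a single mapping run.

    sensitivity = mapped endogenous reads / all endogenous reads
    specificity = unmapped contaminant reads / all contaminant reads
    """
    reads = list(reads)
    endogenous = [r for r in reads if r.is_endogenous]
    contaminant = [r for r in reads if not r.is_endogenous]
    true_pos = sum(r.mapped for r in endogenous)
    true_neg = sum(not r.mapped for r in contaminant)
    sensitivity = true_pos / len(endogenous) if endogenous else 0.0
    specificity = true_neg / len(contaminant) if contaminant else 0.0
    return sensitivity, specificity


def fold_change(mapped_before: int, mapped_after: int) -> float:
    """Ratio of mapped-read counts between two parameter settings."""
    return mapped_after / mapped_before


# Toy data: 4 endogenous reads (3 mapped), 4 contaminant reads (1 mapped).
example_reads = [
    SimulatedRead("e1", True, True),
    SimulatedRead("e2", True, True),
    SimulatedRead("e3", True, True),
    SimulatedRead("e4", True, False),
    SimulatedRead("c1", False, True),
    SimulatedRead("c2", False, False),
    SimulatedRead("c3", False, False),
    SimulatedRead("c4", False, False),
]
sensitivity, specificity = score_mapping(example_reads)
```

Comparing these two numbers across candidate parameter sets is one way to look for a setting that maps more endogenous reads without admitting more contaminant reads. The read counts here are invented toy values, not results from the paper.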

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/716c/5867878/d30f846dda9b/genes-09-00157-g001.jpg
