Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph.

Affiliations

Department of Genetics, University of Cambridge, Cambridge, CB3 0DH, UK.

Wellcome Sanger Institute, Cambridge, CB10 1SA, UK.

Publication information

Genome Biol. 2020 Sep 17;21(1):250. doi: 10.1186/s13059-020-02160-7.

DOI:10.1186/s13059-020-02160-7
PMID:32943086
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7499850/
Abstract

BACKGROUND

During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of aDNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Alternative approaches have been developed to replace the linear reference with a variation graph which includes known alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for aDNA and compare with existing methods.

RESULTS

We use vg to align simulated and real aDNA samples to a variation graph containing 1000 Genomes Project variants and compare with the same data aligned with bwa to the human linear reference genome. Using vg leads to a balanced allelic representation at polymorphic sites, effectively removing reference bias, and to more sensitive variant detection than bwa, especially for insertions and deletions (indels). Alternative approaches that use relaxed bwa parameter settings or filter bwa alignments can also reduce bias but can have lower sensitivity than vg, particularly for indels.

CONCLUSIONS

Our findings demonstrate that aligning aDNA sequences to variation graphs effectively mitigates the impact of reference bias when analyzing aDNA, while retaining mapping sensitivity and allowing detection of variation, in particular indel variation, that was previously missed.
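
One way to make "balanced allelic representation" concrete is to measure, across heterozygous sites, the fraction of mapped reads that carry the non-reference allele: with no reference bias this fraction is expected to be close to 0.5, and it drops below 0.5 when reads with non-reference alleles are lost during mapping. The sketch below is only an illustration of that idea, not the authors' pipeline. The `SiteCounts` type, the function name, and all read counts are hypothetical toy values, not data from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SiteCounts:
    """Read counts at one heterozygous site (hypothetical toy structure)."""
    ref_reads: int  # mapped reads carrying the reference allele
    alt_reads: int  # mapped reads carrying the non-reference allele


def alt_allele_fraction(sites: Iterable[SiteCounts]) -> float:
    """Pooled fraction of reads carrying the non-reference allele.

    At truly heterozygous sites, a value near 0.5 indicates balanced
    mapping; a value below 0.5 indicates bias toward the reference.
    """
    sites = list(sites)
    alt_total = sum(s.alt_reads for s in sites)
    total = alt_total + sum(s.ref_reads for s in sites)
    if total == 0:
        raise ValueError("no reads cover the given sites")
    return alt_total / total


# Made-up counts for illustration only (NOT results from the paper).
linear_counts = [SiteCounts(6, 4), SiteCounts(7, 5), SiteCounts(5, 3)]
graph_counts = [SiteCounts(5, 5), SiteCounts(6, 6), SiteCounts(4, 4)]

if __name__ == "__main__":
    print(f"linear reference: {alt_allele_fraction(linear_counts):.2f}")
    print(f"variation graph:  {alt_allele_fraction(graph_counts):.2f}")
```

In practice the per-site counts would come from a pileup of the alignments at sites known to be heterozygous in the sample; how those sites are chosen is left open here.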

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1e8/7499850/b0db557a24f4/13059_2020_2160_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1e8/7499850/9ed041f93b06/13059_2020_2160_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1e8/7499850/8bfae1e74b3a/13059_2020_2160_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1e8/7499850/68f8387e7c73/13059_2020_2160_Fig3_HTML.jpg

Similar articles

1. Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph.
   Genome Biol. 2020 Sep 17;21(1):250. doi: 10.1186/s13059-020-02160-7.
2. Systematic benchmark of ancient DNA read mapping.
   Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab076.
3. Improving ancient DNA read mapping against modern reference genomes.
   BMC Genomics. 2012 May 10;13:178. doi: 10.1186/1471-2164-13-178.
4. Analysis of optimal alignments unfolds aligners' bias in existing variant profiles.
   BMC Bioinformatics. 2016 Oct 6;17(Suppl 13):349. doi: 10.1186/s12859-016-1216-1.
5. Calling known variants and identifying new variants while rapidly aligning sequence data.
   J Dairy Sci. 2019 Apr;102(4):3216-3229. doi: 10.3168/jds.2018-15172. Epub 2019 Feb 14.
6. Fast alignment of reads to a variation graph with application to SNP detection.
   J Integr Bioinform. 2021 Nov 16;18(4):20210032. doi: 10.1515/jib-2021-0032.
7. Bovine breed-specific augmented reference graphs facilitate accurate sequence read mapping and unbiased variant discovery.
   Genome Biol. 2020 Jul 27;21(1):184. doi: 10.1186/s13059-020-02105-0.
8. Unravelling reference bias in ancient DNA datasets.
   Bioinformatics. 2024 Jul 1;40(7). doi: 10.1093/bioinformatics/btae436.
9. A complete pedigree-based graph workflow for rare candidate variant analysis.
   Genome Res. 2022 May;32(5):893-903. doi: 10.1101/gr.276387.121. Epub 2022 Apr 28.
10. Placing Ancient DNA Sequences into Reference Phylogenies.
    Mol Biol Evol. 2022 Feb 3;39(2). doi: 10.1093/molbev/msac017.

Cited by

1. A New Perspective on the Arrival of the Eastern Mediterranean Genetic Influx in Central Italy Before the Onset of the Roman Empire.
   Genome Biol Evol. 2025 Jul 30;17(8). doi: 10.1093/gbe/evaf149.
2. The genomic footprints of migration: how ancient DNA reveals our history of mobility.
   Genome Biol. 2025 Jul 16;26(1):206. doi: 10.1186/s13059-025-03664-w.
3. Pangenome-aware DeepVariant.
