Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing.

Affiliation

Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany.

Publication information

BMC Genomics. 2012 Aug 22;13:417. doi: 10.1186/1471-2164-13-417.

DOI: 10.1186/1471-2164-13-417
PMID: 22913592
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3563481/
Abstract

BACKGROUND

Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions).

RESULTS

We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent 'read-backmapping' to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per HiSeq2000 exome sample and detected ~5% more SNPs than the conventional whole-genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach.
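The read-backmapping idea can be illustrated with a minimal sketch. The abstract does not give the filtering rule, so the one below is an assumption: a read that mapped to a target region in step 1 is kept only if its best whole-genome placement in step 2 still falls inside a target. The names `Hit`, `in_target`, and `backmap_filter` are hypothetical, the hits are assumed to be already expressed in genomic coordinates, and no aligner is invoked; in practice the hits would come from BWA/SAMtools output.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    chrom: str
    pos: int
    score: int  # alignment score; higher is better (assumed convention)


def in_target(hit, targets):
    """Return True if the hit lies inside any (chrom, start, end) target interval."""
    return any(c == hit.chrom and s <= hit.pos < e for c, s, e in targets)


def backmap_filter(target_hits, genome_hits, targets):
    """Keep reads whose best whole-genome placement is still on target.

    target_hits: read_id -> Hit from step 1 (mapping to target regions only)
    genome_hits: read_id -> best Hit from step 2 (backmapping to the whole genome)
    """
    kept = {}
    for read_id, hit in target_hits.items():
        best = genome_hits.get(read_id)
        # Assumed rule: a read with no whole-genome hit keeps its target
        # placement; a read that places best off target is discarded.
        if best is None or in_target(best, targets):
            kept[read_id] = hit
    return kept


# Toy example with one target interval and three reads.
targets = [("chr17", 100, 200)]
target_hits = {
    "r1": Hit("chr17", 150, 60),
    "r2": Hit("chr17", 120, 40),
    "r3": Hit("chr17", 180, 50),
}
genome_hits = {
    "r1": Hit("chr17", 150, 60),  # confirmed on target
    "r2": Hit("chr5", 9000, 58),  # places better elsewhere, so it is dropped
}
kept = backmap_filter(target_hits, genome_hits, targets)
```

Only the reads that mapped to a target in step 1 need to be aligned against the whole genome, which is where a run-time saving of the kind reported above would plausibly come from.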

CONCLUSIONS

We recommend applying our general 'two-step' mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.
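As a rough illustration of an inter-sample SNP concordance, the sketch below compares the genotype calls of two samples at the sites called in both. The metric (fraction of shared sites with identical genotype) and the function name `snp_concordance` are assumptions made for illustration; the abstract does not specify how concordance was computed.

```python
def snp_concordance(calls_a, calls_b):
    """Fraction of sites called in both samples that have identical genotypes.

    calls_a, calls_b: dict mapping (chrom, pos) -> genotype string.
    Returns None when the two samples share no called site.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None
    agree = sum(1 for site in shared if calls_a[site] == calls_b[site])
    return agree / len(shared)


# Toy example: two shared sites, one of which agrees.
sample_a = {("chr13", 100): "A/G", ("chr13", 200): "C/C", ("chr17", 50): "T/T"}
sample_b = {("chr13", 100): "A/G", ("chr13", 200): "C/T", ("chr17", 99): "G/G"}
concordance = snp_concordance(sample_a, sample_b)
```

Discordant sites, such as the second one in the toy data, are natural candidates for the manual inspection of read alignments that the conclusions recommend.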

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/beb8/3563481/a732976b5bbc/1471-2164-13-417-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/beb8/3563481/c11de85c955d/1471-2164-13-417-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/beb8/3563481/0e410ba2ad50/1471-2164-13-417-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/beb8/3563481/d9c0503b5ede/1471-2164-13-417-3.jpg
