
Benchmarking variant callers in next-generation and third-generation sequencing analysis.

Affiliations

Zhongshan Ophthalmic Center at Sun Yat-sen University and Annoroad Gene Technology (Beijing) Co., Ltd.

Annoroad Gene Technology (Beijing) Co., Ltd.

Publication information

Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa148.

DOI: 10.1093/bib/bbaa148
PMID: 32698196
Abstract

DNA variants represent an important source of genetic variations among individuals. Next-generation sequencing (NGS) is the most popular technology for genome-wide variant calling. Third-generation sequencing (TGS) has also recently been used in genetic studies. Although many variant callers are available, no single caller can call both types of variants on NGS or TGS data with high sensitivity and specificity. In this study, we systematically evaluated 11 variant callers on 12 NGS and TGS datasets. For germline variant calling, we tested DNAseq and DNAscope modes from Sentieon, HaplotypeCaller mode from GATK and WGS mode from DeepVariant. All the four callers had comparable performance on NGS data and 30× coverage of WGS data was recommended. For germline variant calling on TGS data, we tested DNAseq mode from Sentieon, HaplotypeCaller mode from GATK and PACBIO mode from DeepVariant. All the three callers had similar performance in SNP calling, while DeepVariant outperformed the others in InDel calling. TGS detected more variants than NGS, particularly in complex and repetitive regions. For somatic variant calling on NGS, we tested TNscope and TNseq modes from Sentieon, MuTect2 mode from GATK, NeuSomatic, VarScan2, and Strelka2. TNscope and Mutect2 outperformed the other callers. A higher proportion of tumor sample purity (from 10 to 20%) significantly increased the recall value of calling. Finally, computational costs of the callers were compared and Sentieon required the least computational cost. These results suggest that careful selection of a tool and parameters is needed for accurate SNP or InDel calling under different scenarios.

Similar articles

1. Benchmarking variant callers in next-generation and third-generation sequencing analysis.
   Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa148.
2. Variant callers for next-generation sequencing data: a comparison study.
   PLoS One. 2013 Sep 27;8(9):e75619. doi: 10.1371/journal.pone.0075619. eCollection 2013.
3. SNVSniffer: an integrated caller for germline and somatic single-nucleotide and indel mutations.
   BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):47. doi: 10.1186/s12918-016-0300-5.
4. Comparison of GATK and DeepVariant by trio sequencing.
   Sci Rep. 2022 Feb 2;12(1):1809. doi: 10.1038/s41598-022-05833-4.
5. Systematic benchmark of state-of-the-art variant calling pipelines identifies major factors affecting accuracy of coding sequence variant discovery.
   BMC Genomics. 2022 Feb 22;23(1):155. doi: 10.1186/s12864-022-08365-3.
6. Benchmarking reveals superiority of deep learning variant callers on bacterial nanopore sequence data.
   Elife. 2024 Oct 10;13:RP98300. doi: 10.7554/eLife.98300.
7. Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection.
   BMC Genomics. 2024 Sep 3;25(1):827. doi: 10.1186/s12864-024-10737-w.
8. INDELseek: detection of complex insertions and deletions from next-generation sequencing data.
   BMC Genomics. 2017 Jan 5;18(1):16. doi: 10.1186/s12864-016-3449-9.
9. Systematic comparison of germline variant calling pipelines cross multiple next-generation sequencers.
   Sci Rep. 2019 Jun 27;9(1):9345. doi: 10.1038/s41598-019-45835-3.
10. ICR142 Benchmarker: evaluating, optimising and benchmarking variant calling performance using the ICR142 NGS validation series.
    Wellcome Open Res. 2018 Oct 31;3:108. doi: 10.12688/wellcomeopenres.14754.2. eCollection 2018.

Cited by

1. Exploring potential therapeutic targets for colorectal tumors based on whole genome sequencing of colorectal tumors and paracancerous tissues.
   Front Mol Biosci. 2025 Jul 4;12:1605117. doi: 10.3389/fmolb.2025.1605117. eCollection 2025.
2. Decision level scheme for fusing multiomics and histology slide images using deep neural network for tumor prognosis prediction.
   Sci Rep. 2025 Jul 15;15(1):25479. doi: 10.1038/s41598-025-09869-0.
3. hDNApipe: streamlining human genome analysis and interpretation with an intuitive and user-friendly interface.
   NAR Genom Bioinform. 2025 Jun 26;7(2):lqaf088. doi: 10.1093/nargab/lqaf088. eCollection 2025 Jun.
4. Clonal haematopoiesis of indeterminate potential: a risk factor for future exacerbation in patients with COPD.
   ERJ Open Res. 2025 Jun 23;11(3). doi: 10.1183/23120541.00292-2024. eCollection 2025 May.
5. Investigating the Performance of Oxford Nanopore Long-Read Sequencing with Respect to Illumina Microarrays and Short-Read Sequencing.
   Int J Mol Sci. 2025 May 8;26(10):4492. doi: 10.3390/ijms26104492.
6. Artificial intelligence in variant calling: a review.
   Front Bioinform. 2025 Apr 23;5:1574359. doi: 10.3389/fbinf.2025.1574359. eCollection 2025.
7. Analytical validation of germline small variant detection using long-read HiFi genome sequencing.
   Genome Res. 2025 Jun 2;35(6):1391-1399. doi: 10.1101/gr.278836.123.
8. Integration of proteomics profiling data to facilitate discovery of cancer neoantigens: a survey.
   Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf087.
9. The additional diagnostic yield of long-read sequencing in undiagnosed rare diseases.
   Genome Res. 2025 Apr 14;35(4):559-571. doi: 10.1101/gr.279970.124.
10. Combined Genome-Wide Association Study and Linkage Analysis for Mining Candidate Genes for the Kernel Row Number in Maize ( L.).
    Plants (Basel). 2024 Nov 26;13(23):3308. doi: 10.3390/plants13233308.