Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data.

Affiliations

Institute of Medical Informatics, University of Münster, Münster, 48149, Germany.

Laboratory Hematology, RadboudUMC, Nijmegen, 6525, Netherlands.

Publication information

Sci Rep. 2017 Feb 24;7:43169. doi: 10.1038/srep43169.

DOI:10.1038/srep43169

PMID:28233799

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5324109/

Abstract

Valid variant calling results are crucial for the use of next-generation sequencing in clinical routine. However, there are numerous variant calling tools that usually differ in algorithms, filtering strategies and recommendations, and thus also in their output. We evaluated eight open-source tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data: GATK HaplotypeCaller, Platypus, VarScan, LoFreq, FreeBayes, SNVer, SAMtools and VarDict. We analysed two real datasets from patients with myelodysplastic syndrome, covering 54 Illumina HiSeq samples and 111 Illumina NextSeq samples. Mutations were validated by re-sequencing on the same platform, re-sequencing on a different platform and expert-based review. In addition, we considered two simulated datasets with varying coverage and error profiles, covering 50 samples each. In all cases, an identical target region consisting of 19 genes (42,322 bp) was analysed. Altogether, no tool succeeded in calling all mutations. High sensitivity was always accompanied by low precision. The influence of varying coverage and background noise on variant calling was generally low. Taking everything into account, VarDict performed best. However, our results indicate that there is a need to improve the reproducibility of the results in the context of multithreading.
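
The trade-off between sensitivity and precision mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions (sensitivity = true positives / validated variants, precision = true positives / called variants); the abstract does not spell out the exact formulas used in the study. The function name, the tuple representation of a variant and the example variants are hypothetical and serve illustration only.

```python
from typing import Set, Tuple

# Hypothetical variant representation: (chromosome, position, ref, alt).
Variant = Tuple[str, int, str, str]


def sensitivity_and_precision(
    called: Set[Variant], validated: Set[Variant]
) -> Tuple[float, float]:
    """Compare one tool's calls against a validated truth set.

    Sensitivity: fraction of validated variants that the tool called.
    Precision: fraction of the tool's calls that are validated.
    Returns 0.0 for a metric whose denominator would be zero.
    """
    true_positives = len(called & validated)
    sensitivity = true_positives / len(validated) if validated else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision


# Made-up example: four validated mutations, six calls, three of them correct.
validated = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "CA"),
    ("chr4", 400, "T", "G"),
}
called = {
    ("chr1", 100, "A", "T"),
    ("chr2", 200, "G", "C"),
    ("chr3", 300, "C", "CA"),
    ("chr5", 500, "A", "G"),
    ("chr6", 600, "G", "T"),
    ("chr7", 700, "C", "T"),
}

sens, prec = sensitivity_and_precision(called, validated)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In this made-up case the tool recovers three of four validated mutations but half of its calls are unconfirmed, which mirrors the pattern the abstract describes: a caller tuned to miss fewer mutations tends to report more false positives.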

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f17/5324109/32221c45f672/srep43169-f1.jpg

Similar articles

Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data.

Sci Rep. 2017 Feb 24;7:43169. doi: 10.1038/srep43169.

Evaluation of variant detection software for pooled next-generation sequence data.

BMC Bioinformatics. 2015 Jul 29;16:235. doi: 10.1186/s12859-015-0624-y.

appreci8: a pipeline for precise variant calling integrating 8 tools.

Bioinformatics. 2018 Dec 15;34(24):4205-4212. doi: 10.1093/bioinformatics/bty518.

Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.

BMC Genomics. 2015 Aug 5;16(1):581. doi: 10.1186/s12864-015-1796-6.

Challenges in exome analysis by LifeScope and its alternative computational pipelines.

BMC Res Notes. 2015 Sep 7;8:421. doi: 10.1186/s13104-015-1385-4.

Evaluation of variant calling tools for large plant genome re-sequencing.

BMC Bioinformatics. 2020 Aug 17;21(1):360. doi: 10.1186/s12859-020-03704-1.

VariantMetaCaller: automated fusion of variant calling pipelines for quantitative, precision-based filtering.

BMC Genomics. 2015 Oct 28;16:875. doi: 10.1186/s12864-015-2050-y.

Comparison of INDEL Calling Tools with Simulation Data and Real Short-Read Data.

IEEE/ACM Trans Comput Biol Bioinform. 2019 Sep-Oct;16(5):1635-1644. doi: 10.1109/TCBB.2018.2854793. Epub 2018 Jul 10.

GATK hard filtering: tunable parameters to improve variant calling for next generation sequencing targeted gene panel data.

BMC Bioinformatics. 2017 Mar 23;18(Suppl 5):119. doi: 10.1186/s12859-017-1537-8.

Systematic comparison of germline variant calling pipelines cross multiple next-generation sequencers.

Sci Rep. 2019 Jun 27;9(1):9345. doi: 10.1038/s41598-019-45835-3.

Cited by

The Role of Somatic Mutation in Hereditary Hemorrhagic Telangiectasia Pathogenesis.

J Clin Med. 2025 Jun 24;14(13):4479. doi: 10.3390/jcm14134479.

Recurrent spontaneous miscarriages from sperm after ABVD chemotherapy in a patient with Hodgkin's lymphoma: sperm DNA and methylation profiling.

Asian J Androl. 2025 Sep 1;27(5):598-610. doi: 10.4103/aja2024107. Epub 2025 Apr 15.

Assessing myBaits Target Capture Sequencing Methodology Using Short-Read Sequencing for Variant Detection in Oat Genomics and Breeding.

Genes (Basel). 2024 May 27;15(6):700. doi: 10.3390/genes15060700.

Differential requirement for RecFOR pathway components in Thermus thermophilus.

Environ Microbiol Rep. 2024 Jun;16(3):e13269. doi: 10.1111/1758-2229.13269.

Transposon DNA sequences facilitate the tissue-specific gene transfer of circulating tumor DNA between human cells.

Nucleic Acids Res. 2024 Jul 22;52(13):7539-7555. doi: 10.1093/nar/gkae427.

Fast and accurate variant identification tool for sequencing-based studies.

BMC Biol. 2024 Apr 22;22(1):90. doi: 10.1186/s12915-024-01891-4.

ArCH: improving the performance of clonal hematopoiesis variant calling and interpretation.

Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae121.

Systematic comparison of variant calling pipelines of target genome sequencing cross multiple next-generation sequencers.

Front Genet. 2024 Jan 4;14:1293974. doi: 10.3389/fgene.2023.1293974. eCollection 2023.

Comparison of Nanopore and Synthesis-Based Next-Generation Sequencing Platforms for SARS-CoV-2 Variant Monitoring in Wastewater.

Int J Mol Sci. 2023 Dec 6;24(24):17184. doi: 10.3390/ijms242417184.

Performance analysis of conventional and AI-based variant callers using short and long reads.

BMC Bioinformatics. 2023 Dec 14;24(1):472. doi: 10.1186/s12859-023-05596-3.

References

From Wet-Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing.

Hum Mutat. 2016 Dec;37(12):1263-1271. doi: 10.1002/humu.23114. Epub 2016 Sep 26.

Analysis of protein-coding genetic variation in 60,706 humans.

Nature. 2016 Aug 18;536(7616):285-91. doi: 10.1038/nature19057.

VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research.

Nucleic Acids Res. 2016 Jun 20;44(11):e108. doi: 10.1093/nar/gkw227. Epub 2016 Apr 7.

A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.

Nat Commun. 2015 Dec 9;6:10001. doi: 10.1038/ncomms10001.

ClinVar: public archive of interpretations of clinically relevant variants.

Nucleic Acids Res. 2016 Jan 4;44(D1):D862-8. doi: 10.1093/nar/gkv1222. Epub 2015 Nov 17.

Telomerase activation by genomic rearrangements in high-risk neuroblastoma.

Nature. 2015 Oct 29;526(7575):700-4. doi: 10.1038/nature14980. Epub 2015 Oct 14.

A global reference for human genetic variation.

Nature. 2015 Oct 1;526(7571):68-74. doi: 10.1038/nature15393.

Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.

Nat Methods. 2015 Jul;12(7):623-30. doi: 10.1038/nmeth.3407. Epub 2015 May 18.

SF3B1 mutation identifies a distinct subset of myelodysplastic syndrome with ring sideroblasts.

Blood. 2015 Jul 9;126(2):233-41. doi: 10.1182/blood-2015-03-633537. Epub 2015 May 8.

Unified representation of genetic variants.

Bioinformatics. 2015 Jul 1;31(13):2202-4. doi: 10.1093/bioinformatics/btv112. Epub 2015 Feb 19.
