Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data.

Affiliation

Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.

Publication information

Sci Rep. 2023 Nov 22;13(1):20444. doi: 10.1038/s41598-023-47135-3.

Abstract

Accurate detection of low-frequency variants plays an important role in biomedical research and clinical practice. However, this is challenging with next-generation sequencing (NGS) approaches because of the high error rates of NGS. To distinguish true low-level variants from these errors, many statistical variant calling tools for low-frequency variants have been proposed, but a systematic performance comparison of these tools has not yet been performed. Here, we evaluated four raw-reads-based variant callers (SiNVICT, outLyzer, Pisces, and LoFreq) and four UMI-based variant callers (DeepSNVMiner, MAGERI, smCounter2, and UMI-VarCal) for their ability to call single nucleotide variants (SNVs) with allelic frequencies as low as 0.025% in deep sequencing data. We analyzed a total of 54 simulated datasets with various sequencing depths and variant allele frequencies (VAFs), two reference datasets, and Horizon Tru-Q sample data. The results showed that the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in terms of detection limit. Sequencing depth had almost no effect on the UMI-based callers but significantly influenced the raw-reads-based callers. Regardless of the sequencing depth, MAGERI was the fastest, while smCounter2 consistently took the longest to finish the variant calling process. Overall, DeepSNVMiner and UMI-VarCal performed the best, with sensitivity and precision of 88% and 100% for DeepSNVMiner and 84% and 100% for UMI-VarCal. In conclusion, the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in terms of sensitivity and precision. We recommend using DeepSNVMiner and UMI-VarCal for low-frequency variant detection. The results provide important information on future directions for reliable low-frequency variant detection and algorithm development, which is critical in genetics-based medical research and clinical applications.
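The sensitivity and precision figures quoted in the abstract can be illustrated with a small Python sketch. It assumes the standard definitions (sensitivity = TP / (TP + FN), precision = TP / (TP + FP)) and that variants are matched exactly on chromosome, position, reference allele, and alternate allele; the abstract does not describe the authors' matching procedure, so this is not a reproduction of their pipeline. The function names, the toy variants, and the 20,000x depth are all hypothetical. The second helper shows why a VAF of 0.025% is demanding: at a given depth, it estimates how many reads are expected to carry the variant allele.

```python
from typing import Iterable, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
# Exact matching on these four fields is an assumption made for illustration.
Variant = Tuple[str, int, str, str]


def sensitivity_precision(
    truth: Iterable[Variant], calls: Iterable[Variant]
) -> Tuple[float, float]:
    """Return (sensitivity, precision) of a call set against a truth set."""
    truth_set, call_set = set(truth), set(calls)
    tp = len(truth_set & call_set)   # true variants that were called
    fn = len(truth_set - call_set)   # true variants that were missed
    fp = len(call_set - truth_set)   # calls that are not true variants
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


def expected_alt_reads(depth: int, vaf_percent: float) -> float:
    """Expected number of reads supporting the variant allele at a site."""
    return depth * vaf_percent / 100.0


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    truth = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 300, "G", "A"), ("chr3", 400, "T", "C")]
    calls = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
             ("chr2", 300, "G", "A"), ("chr5", 999, "A", "T")]
    sens, prec = sensitivity_precision(truth, calls)
    print(f"sensitivity={sens:.2f} precision={prec:.2f}")

    # Hypothetical depth of 20,000x at the lowest VAF named in the abstract.
    print(f"expected alt reads: {expected_alt_reads(20_000, 0.025):.1f}")
```

With only a handful of supporting reads expected at such a depth, true variants sit close to the background of sequencing errors, which is the problem the callers compared in the abstract try to address.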

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e884/10665316/b262ee9ed860/41598_2023_47135_Fig1_HTML.jpg
