UMI-VarCal: a new UMI-based variant caller that efficiently improves low-frequency variant detection in paired-end sequencing NGS libraries.

Affiliations

University of Normandie UNIROUEN, LITIS EA 4108.

Department of Pathology, Centre Henri Becquerel.

Publication information

Bioinformatics. 2020 May 1;36(9):2718-2724. doi: 10.1093/bioinformatics/btaa053.

Abstract

MOTIVATION

Next-generation sequencing has become the standard method for detecting single-nucleotide variants in tumor cells. These technologies require a PCR amplification step and a sequencing step, both of which introduce artifacts at very low frequencies. These artifacts are often confused with true low-frequency variants found in tumor cells and cell-free DNA. The recent use of unique molecular identifiers (UMIs) in targeted sequencing protocols offers a reliable way to filter out artifactual variants and accurately call low-frequency variants. However, integrating UMI analysis into the variant calling process has produced tools that are significantly slower and more memory-intensive than variant callers based on raw reads.

RESULTS

We present UMI-VarCal, a UMI-based variant caller for targeted sequencing data that is more sensitive than other variant callers. Developed with performance in mind, UMI-VarCal is one of the few variant callers that do not rely on SAMtools for the pileup. Instead, its core is a new custom pileup algorithm designed specifically to handle the UMI tags in the reads. After the pileup, a Poisson statistical test is applied at every position to determine whether the variant frequency is significantly higher than the background error noise. Finally, the UMI tags are analyzed, and strand bias and homopolymer length filters are applied to improve accuracy. We illustrate the results obtained with UMI-VarCal by sequencing tumor samples and show that it is both faster and more sensitive than other publicly available solutions.
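The per-position Poisson test can be illustrated with a minimal sketch. It is not taken from the UMI-VarCal source: the function names, the background error rate of 0.001 and the significance threshold of 0.01 are assumptions chosen for the example, and the abstract does not state how the tool estimates its background noise. The idea shown is only that, under a Poisson model with mean `depth * error_rate`, a variant is retained when the probability of seeing at least the observed number of alternate reads by chance falls below the threshold.

```python
import math


def poisson_pmf(i: int, lam: float) -> float:
    """Poisson probability mass P(X = i), evaluated in log space for stability."""
    return math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))


def poisson_upper_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    if k <= lam:
        # Near or below the mean: subtract the short lower tail from 1.
        return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))
    # Above the mean the terms shrink monotonically, so sum until negligible.
    total = 0.0
    i = k
    while True:
        term = poisson_pmf(i, lam)
        total += term
        if term <= total * 1e-16:
            return total
        i += 1


def is_significant(alt_count: int, depth: int,
                   error_rate: float = 0.001,  # hypothetical background error rate
                   alpha: float = 0.01) -> bool:  # hypothetical threshold
    """Return True if alt_count exceeds what background noise alone would explain."""
    expected_errors = depth * error_rate
    p_value = poisson_upper_tail(alt_count, expected_errors)
    return p_value < alpha


if __name__ == "__main__":
    # At depth 2000 with a 0.1% error rate, about 2 erroneous reads are expected.
    print(is_significant(alt_count=12, depth=2000))  # far above the expected noise
    print(is_significant(alt_count=3, depth=2000))   # compatible with noise
```

In a real caller the error rate would likely be estimated from the data rather than fixed, and the p-values would typically be corrected for the number of positions tested; both are left out of this sketch.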

AVAILABILITY AND IMPLEMENTATION

The entire pipeline is available at https://gitlab.com/vincent-sater/umi-varcal-master under the MIT license.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
