UMI-Gen: A UMI-based read simulator for variant calling evaluation in paired-end sequencing NGS libraries.

Authors

Sater Vincent, Viailly Pierre-Julien, Lecroq Thierry, Ruminy Philippe, Bérard Caroline, Prieur-Gaston Élise, Jardin Fabrice

Affiliations

University of Rouen Normandy UNIROUEN, LITIS EA 4108, 76000 Rouen, France.

INSERM U1245, University of Rouen Normandy UNIROUEN, 76000 Rouen, France.

Publication information

Comput Struct Biotechnol J. 2020 Aug 27;18:2270-2280. doi: 10.1016/j.csbj.2020.08.011. eCollection 2020.

DOI: 10.1016/j.csbj.2020.08.011
PMID: 32952940
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7484502/
Abstract

MOTIVATION

As Next Generation Sequencing (NGS) becomes more affordable every year, NGS technologies have established themselves as the fastest and most reliable way to detect Single Nucleotide Variants (SNV) and Copy Number Variations (CNV) in cancer patients. These technologies can sequence DNA at very high depths, which makes it possible to detect abnormalities present in tumor cells at very low frequencies. Multiple variant callers are publicly available and are usually efficient at calling variants. However, when frequencies drop below 1%, the specificity of these tools suffers greatly, because true variants at very low frequencies are easily confused with sequencing or PCR artifacts. The recent use of Unique Molecular Identifiers (UMI) in NGS experiments offers a way to accurately separate true variants from artifacts. UMI-based variant callers are slowly replacing raw-read-based variant callers as the standard method for accurate detection of variants at very low frequencies. However, the benchmarks reported in the tools' publications are usually performed on real biological data in which the true variants are not known, making it difficult to assess the tools' accuracy.

RESULTS

We present UMI-Gen, a UMI-based read simulator for targeted paired-end sequencing data. UMI-Gen first generates reference reads covering the targeted regions at a user-customizable depth. Then, using a number of control files, it estimates the background error rate at each position and modifies the generated reads to mimic real biological data. Finally, it inserts real variants into the reads from a list provided by the user.
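The three stages described above (reference reads, background errors estimated from controls, variant insertion) can be illustrated with a toy Python sketch. This is not UMI-Gen's code or interface: every function name, the 8-nt random UMI, the dictionary-based control pileup, and the single-end reads are assumptions made for illustration only, and the real tool works on paired-end data and its own input file formats.

```python
import random

BASES = "ACGT"

def generate_reference_reads(reference, depth, read_len, rng):
    """Draw error-free reads at random offsets so the mean depth is about `depth`."""
    n_reads = max(1, depth * len(reference) // read_len)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_len + 1)
        umi = "".join(rng.choice(BASES) for _ in range(8))  # hypothetical 8-nt UMI tag
        reads.append({"umi": umi, "start": start,
                      "seq": list(reference[start:start + read_len])})
    return reads

def estimate_error_rates(reference, control_counts):
    """Per-position error rate = non-reference bases / all bases seen in controls."""
    rates = []
    for ref_base, counts in zip(reference, control_counts):
        total = sum(counts.values())
        mismatches = total - counts.get(ref_base, 0)
        rates.append(mismatches / total if total else 0.0)
    return rates

def add_background_errors(reads, error_rates, rng):
    """Substitute each base with probability equal to the error rate at its position."""
    for read in reads:
        for i, base in enumerate(read["seq"]):
            if rng.random() < error_rates[read["start"] + i]:
                read["seq"][i] = rng.choice([b for b in BASES if b != base])

def insert_variant(reads, pos, alt, allele_freq, rng):
    """Write `alt` at reference position `pos` in roughly `allele_freq` of covering reads."""
    for read in reads:
        offset = pos - read["start"]
        if 0 <= offset < len(read["seq"]) and rng.random() < allele_freq:
            read["seq"][offset] = alt

rng = random.Random(42)
reference = "ACGTTGCAAGCTTAGGCTAACGTTAGCCTA" * 2  # toy 60-nt target region
reads = generate_reference_reads(reference, depth=100, read_len=20, rng=rng)
# Toy control pileup: 999 reference bases and 1 mismatch at every position.
control = [{b: 999, ("C" if b == "A" else "A"): 1} for b in reference]
rates = estimate_error_rates(reference, control)
add_background_errors(reads, rates, rng)
insert_variant(reads, pos=30, alt="T", allele_freq=0.005, rng=rng)
```

Because the inserted variant's position and target frequency are known in advance, the output of a variant caller run on such reads can be scored against ground truth, which is the benchmarking gap the abstract describes.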

AVAILABILITY

The entire pipeline is available at https://gitlab.com/vincent-sater/umigen under MIT license.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/386416a2980f/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/556dfd2f4093/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/542854fd5331/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/e4032d6da0da/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/80e3a28209db/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/a01816549cfe/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/dfcccbb33ef8/gr7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/89eb246b0e8a/gr8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/7892a448e566/gr9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b1/7484502/723e69341d92/gr10.jpg

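Cited by

1
Unveiling the Future of Infective Endocarditis Diagnosis: The Transformative Role of Metagenomic Next-Generation Sequencing in Culture-Negative Cases.
J Epidemiol Glob Health. 2025 Aug 22;15(1):108. doi: 10.1007/s44197-025-00455-1.
2
Liquid Biopsy in B and T Cell Lymphomas: From Bench to Bedside.
Int J Mol Sci. 2025 May 19;26(10):4869. doi: 10.3390/ijms26104869.
3
Benchmarking UMI-aware and standard variant callers for low frequency ctDNA variant detection.
BMC Genomics. 2024 Sep 3;25(1):827. doi: 10.1186/s12864-024-10737-w.
4
Prospects for liquid biopsy approaches in lymphomas.
Leuk Lymphoma. 2024 Dec;65(13):1923-1933. doi: 10.1080/10428194.2024.2389210. Epub 2024 Aug 10.
5
Integrated approach to generate artificial samples with low tumor fraction for somatic variant calling benchmarking.
BMC Bioinformatics. 2024 May 8;25(1):180. doi: 10.1186/s12859-024-05793-8.
6
Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data.
Sci Rep. 2023 Nov 22;13(1):20444. doi: 10.1038/s41598-023-47135-3.
7
Cloud-native distributed genomic pileup operations.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btac804.
8
cfDNA Sequencing: Technological Approaches and Bioinformatic Issues.
Pharmaceuticals (Basel). 2021 Jun 21;14(6):596. doi: 10.3390/ph14060596.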
