Stosic Milan S, Costanzi Jean-Marc, Ambur Ole Herman, Rounge Trine B
Department of Life Sciences and Health, Faculty of Health Sciences, Oslo Metropolitan University-OsloMet, Oslo, Norway.
Department of Microbiology and Infection Control, Akershus University Hospital, Lørenskog, Norway.
Sci Rep. 2025 Jul 2;15(1):23003. doi: 10.1038/s41598-025-05267-8.
Accurate detection of low-frequency mutations is crucial for understanding viral evolution and tumorigenesis in humans, but it is often confounded by technical artifacts introduced during library preparation and sequencing. We present GENOMICON-Seq, an end-to-end simulation tool that models both amplicon and whole exome sequencing (WES) workflows with realistic biological mutations and technical noise. GENOMICON-Seq inserts ground-truth mutations, ranging from APOBEC3-like edits to COSMIC single base substitution signatures, before subjecting samples to simulated PCR errors, probe-capture enrichment, and Illumina-specific sequencing biases. By tracking each mutation's origin (true or error-derived), researchers can pinpoint detection limits and optimize variant-calling thresholds. We illustrate GENOMICON-Seq's versatility through case studies involving human papillomavirus (HPV) amplicon sequencing, highlighting the impact of polymerase fidelity, viral copy number, and read depth on the detection of low-frequency mutations. In parallel, WES simulations demonstrate how capture biases and varying allele frequencies affect somatic mutation calls. GENOMICON-Seq is thus a flexible, reproducible framework for assessing new protocols, benchmarking variant callers, and refining data analysis pipelines, ultimately reducing costly trial-and-error in the laboratory. The Docker-based package is freely available at https://github.com/Rounge-lab/GENOMICON-Seq.
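The origin-tracking idea described in the abstract can be illustrated with a toy sketch. This is not GENOMICON-Seq's implementation or interface: the function name `simulate_variant_origins`, its parameters, and the simplified model (uniform substitution rates, PCR in which every molecule doubles each cycle, no capture step, no sequencer error) are all assumptions made for illustration only. Each simulated molecule carries its substitutions labeled as `"true"` or `"pcr_error"`, so the reads sampled at the end can be split by origin.

```python
import random
from collections import Counter

BASES = "ACGT"


def simulate_variant_origins(reference, n_copies, true_mut_rate,
                             pcr_error_rate, n_cycles, n_reads, seed=0):
    """Toy model (hypothetical, not the GENOMICON-Seq API).

    Returns a Counter keyed by (position, alt_base, origin), counting how
    many sampled reads carry each substitution.
    """
    rng = random.Random(seed)

    def random_alt(ref_base):
        # Pick any base other than the reference base, uniformly.
        return rng.choice([b for b in BASES if b != ref_base])

    # 1. Ground-truth mutations: each template molecule is a dict
    #    {position: (alt_base, origin)}.
    pool = []
    for _ in range(n_copies):
        muts = {}
        for pos, ref_base in enumerate(reference):
            if rng.random() < true_mut_rate:
                muts[pos] = (random_alt(ref_base), "true")
        pool.append(muts)

    # 2. Simplified PCR: every molecule is copied once per cycle, and the
    #    copy may gain polymerase errors. Positions that already carry a
    #    substitution are skipped to keep the sketch short.
    for _ in range(n_cycles):
        copies = []
        for muts in pool:
            child = dict(muts)
            for pos, ref_base in enumerate(reference):
                if pos not in child and rng.random() < pcr_error_rate:
                    child[pos] = (random_alt(ref_base), "pcr_error")
            copies.append(child)
        pool = pool + copies

    # 3. Sequencing: sample molecules with replacement; each read reports
    #    every substitution on its molecule (no sequencer error modeled).
    counts = Counter()
    for _ in range(n_reads):
        molecule = rng.choice(pool)
        for pos, (alt, origin) in molecule.items():
            counts[(pos, alt, origin)] += 1
    return counts


if __name__ == "__main__":
    ref = "".join(random.Random(42).choice(BASES) for _ in range(60))
    counts = simulate_variant_origins(ref, n_copies=50, true_mut_rate=0.002,
                                      pcr_error_rate=0.0005, n_cycles=5,
                                      n_reads=2000, seed=1)
    by_origin = Counter()
    for (_, _, origin), n in counts.items():
        by_origin[origin] += n
    print(dict(by_origin))
```

Varying `pcr_error_rate` (a stand-in for polymerase fidelity), `n_copies`, and `n_reads` in such a sketch shows, in miniature, why the label on each substitution is useful: the frequency at which error-derived variants start to overlap with true ones suggests where a variant-calling threshold could sit.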