SomatoSim: precision simulation of somatic single nucleotide variants.

Affiliation

Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Publication information

BMC Bioinformatics. 2021 Mar 6;22(1):109. doi: 10.1186/s12859-021-04024-8.

DOI: 10.1186/s12859-021-04024-8
PMID: 33676403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7936459/

Abstract

BACKGROUND

Somatic single nucleotide variants have gained increased attention because of their role in cancer development and the widespread use of high-throughput sequencing techniques. The necessity to accurately identify these variants in sequencing data has led to a proliferation of somatic variant calling tools. Additionally, the use of simulated data to assess the performance of these tools has become common practice, as there is no gold standard dataset for benchmarking performance. However, many existing somatic variant simulation tools are limited because they rely on generating entirely synthetic reads derived from a reference genome or because they do not allow for the precise customizability that would enable a more focused understanding of single nucleotide variant calling performance.

RESULTS

SomatoSim is a tool that lets users simulate somatic single nucleotide variants in sequence alignment map (SAM/BAM) files with full control of the specific variant positions, number of variants, variant allele fractions, depth of coverage, read quality, and base quality, among other parameters. SomatoSim accomplishes this through a three-stage process: variant selection, where candidate positions are selected for simulation, variant simulation, where reads are selected and mutated, and variant evaluation, where SomatoSim summarizes the simulation results.

CONCLUSIONS

SomatoSim is a user-friendly tool that offers a high level of customizability for simulating somatic single nucleotide variants. SomatoSim is available at https://github.com/BieseckerLab/SomatoSim.
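
The three-stage process described in the Results (variant selection, variant simulation, variant evaluation) can be illustrated with a toy sketch. This is not SomatoSim's code or API. It assumes a simplified input in which each genomic position maps to a plain list of read bases, whereas the real tool operates on SAM/BAM alignments and also takes mapping quality and base quality into account. All function names, the `ALT_FOR` substitution table, and the example data are hypothetical.

```python
import random
from collections import Counter

# Hypothetical substitution table: which alternate base to write for each reference base.
ALT_FOR = {"A": "G", "C": "T", "G": "A", "T": "C"}

def select_positions(pileups, min_depth, n_variants, rng):
    """Stage 1, variant selection: keep positions with enough coverage, then sample."""
    eligible = sorted(pos for pos, bases in pileups.items() if len(bases) >= min_depth)
    return sorted(rng.sample(eligible, min(n_variants, len(eligible))))

def simulate_variant(bases, ref, alt, target_vaf, rng):
    """Stage 2, variant simulation: mutate reference-supporting reads toward target_vaf."""
    n_mutate = round(target_vaf * len(bases))
    ref_idx = [i for i, b in enumerate(bases) if b == ref]
    chosen = set(rng.sample(ref_idx, min(n_mutate, len(ref_idx))))
    return [alt if i in chosen else b for i, b in enumerate(bases)]

def evaluate(bases, alt):
    """Stage 3, variant evaluation: summarize depth, alt read count, and achieved VAF."""
    depth = len(bases)
    alt_count = Counter(bases)[alt]
    return {"depth": depth, "alt_count": alt_count, "vaf": alt_count / depth}

# Toy pileups (assumed data): position -> bases observed in the reads covering it.
pileups = {100: ["A"] * 200, 250: ["C"] * 40, 300: ["G"] * 500}
rng = random.Random(0)

# Position 250 is excluded because its depth (40) is below the threshold (100).
positions = select_positions(pileups, min_depth=100, n_variants=2, rng=rng)

results = {}
for pos in positions:
    ref = pileups[pos][0]
    alt = ALT_FOR[ref]
    mutated = simulate_variant(pileups[pos], ref, alt, target_vaf=0.05, rng=rng)
    results[pos] = evaluate(mutated, alt)

for pos, summary in results.items():
    print(pos, summary)
```

Because the number of mutated reads is rounded to a whole number, the achieved variant allele fraction can differ slightly from the requested one at depths where the target does not divide evenly, which is one reason a final evaluation stage is useful.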

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0401/7936459/1e994f6a25f1/12859_2021_4024_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0401/7936459/d5dd4d94ead3/12859_2021_4024_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0401/7936459/34d622634b94/12859_2021_4024_Fig3_HTML.jpg
