
mSigHdp: hierarchical Dirichlet process mixture modeling for mutational signature discovery.

Authors

Liu Mo, Wu Yang, Jiang Nanhai, Boot Arnoud, Rozen Steven G

Affiliations

Programme in Cancer & Stem Cell Biology, Duke-NUS Medical School, 169857 Singapore.

Centre for Computational Biology, Duke-NUS Medical School, 169857 Singapore.

Publication information

NAR Genom Bioinform. 2023 Jan 23;5(1):lqad005. doi: 10.1093/nargab/lqad005. eCollection 2023 Mar.

DOI:10.1093/nargab/lqad005
PMID:36694663
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9869330/
Abstract

Mutational signatures are characteristic patterns of mutations caused by endogenous or exogenous mutational processes. These signatures can be discovered by analyzing mutations in large sets of samples, usually somatic mutations in tumor samples. Most programs for discovering mutational signatures are based on non-negative matrix factorization (NMF). Alternatively, signatures can be discovered using hierarchical Dirichlet process (HDP) mixture models, an approach that has been less explored. These models assign mutations to clusters and view each cluster as being generated from the signature of a particular mutational process. Here, we describe mSigHdp, an improved approach to using HDP mixture models to discover mutational signatures. We benchmarked mSigHdp and state-of-the-art NMF-based approaches on four realistic synthetic data sets. These data sets encompassed 18 cancer types. In total, they contained 3.5 × 10^7 single-base-substitution mutations representing 32 signatures and 6.1 × 10^6 small insertion and deletion mutations representing 13 signatures. For three of the four data sets, mSigHdp had the best positive predictive value for discovering mutational signatures, and for all four data sets, it had the best true positive rate. Its CPU usage was similar to that of the NMF-based approaches. Thus, mSigHdp is an important and practical addition to the set of tools available for discovering mutational signatures.
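The abstract compares tools by positive predictive value (PPV) and true positive rate (TPR) for signature discovery. The sketch below illustrates how such scores could be computed on synthetic data, where the ground-truth signatures are known. It is not the authors' evaluation code: the greedy one-to-one matching, the cosine-similarity threshold of 0.9, and the names `cosine_similarity` and `score_discovery` are all assumptions made for illustration. Only the definitions PPV = TP / (TP + FP) and TPR = TP / (TP + FN) are standard.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length mutation-type profiles."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_discovery(extracted, ground_truth, threshold=0.9):
    """Return (PPV, TPR) for a set of extracted signatures.

    Assumption: an extracted signature counts as a true positive when it is
    matched one-to-one to a ground-truth signature with cosine similarity
    at or above `threshold`. Pairs are matched greedily, best first.
    """
    pairs = sorted(
        (
            (cosine_similarity(e, g), i, j)
            for i, e in enumerate(extracted)
            for j, g in enumerate(ground_truth)
        ),
        reverse=True,
    )
    used_extracted, used_truth = set(), set()
    true_positives = 0
    for similarity, i, j in pairs:
        if similarity < threshold:
            break  # remaining pairs are all below the threshold
        if i in used_extracted or j in used_truth:
            continue
        used_extracted.add(i)
        used_truth.add(j)
        true_positives += 1
    # PPV = TP / (TP + FP): fraction of extracted signatures that are real.
    ppv = true_positives / len(extracted) if extracted else 0.0
    # TPR = TP / (TP + FN): fraction of real signatures that were found.
    tpr = true_positives / len(ground_truth) if ground_truth else 0.0
    return ppv, tpr


# Toy profiles over four mutation types (real SBS profiles have 96 channels).
ground_truth = [
    [0.70, 0.20, 0.10, 0.00],
    [0.00, 0.10, 0.20, 0.70],
    [0.25, 0.25, 0.25, 0.25],
]
extracted = [
    [0.68, 0.22, 0.10, 0.00],  # close to the first ground-truth signature
    [0.00, 0.50, 0.50, 0.00],  # matches nothing above the threshold
]

ppv, tpr = score_discovery(extracted, ground_truth)
print(f"PPV = {ppv:.2f}, TPR = {tpr:.2f}")
```

In this toy case one of the two extracted signatures matches a ground-truth signature, so one of the three true signatures is recovered. A tool that extracts many spurious signatures lowers its PPV, while a tool that misses real signatures lowers its TPR.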


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/f81830b90530/lqad005fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/06a5e7aeff14/lqad005fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/9bfb6d7d495a/lqad005fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/91829f15ad71/lqad005fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/e20f20306c61/lqad005fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/c55b021492a6/lqad005fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e43/9869330/0041ec6acc9d/lqad005fig7.jpg
