A site specific model and analysis of the neutral somatic mutation rate in whole-genome cancer data.

Affiliations

Department of Molecular Medicine, Aarhus University, Palle Juul-Jensens Boulevard 99, Aarhus N, DK-8200, Denmark.

Bioinformatics Research Centre, Aarhus University, C.F. Mollers Alle 8, Aarhus C, DK-8000, Denmark.

Publication information

BMC Bioinformatics. 2018 Apr 19;19(1):147. doi: 10.1186/s12859-018-2141-2.

DOI:10.1186/s12859-018-2141-2

PMID:29673314

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5909259/

Abstract

BACKGROUND

Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation rate differs between cancer types, between patients and along the genome depending on the genetic and epigenetic context. Therefore, methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. A major drawback of most methods is the need to average the explanatory variables across the entire region or genomic element. This procedure is particularly problematic if the explanatory variable varies dramatically in the element under consideration.
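The averaging problem described above can be illustrated with a small numeric sketch. The logistic rate function, its intercept and slope, and the covariate values are all hypothetical and chosen only to make the effect visible; they are not taken from the paper.

```python
import math

def site_rate(x, intercept=-8.0, slope=2.0):
    """Hypothetical per-site mutation probability from a logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Hypothetical covariate over a 10-site element: mostly flat,
# with two sites where the value differs sharply.
covariate = [0.0] * 8 + [3.0, 3.0]

# Regional approach: average the covariate first, then predict once
# and scale by the number of sites.
mean_x = sum(covariate) / len(covariate)
regional_expected = len(covariate) * site_rate(mean_x)

# Site-specific approach: predict at every site, then sum.
site_expected = sum(site_rate(x) for x in covariate)

print(f"regional expected mutations:      {regional_expected:.4f}")
print(f"site-specific expected mutations: {site_expected:.4f}")
```

Because the rate is a non-linear function of the covariate, the two expected counts disagree whenever the covariate varies within the element, and the gap grows with the amount of variation.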

RESULTS

To account for the fine scale of the explanatory variables, we model the probabilities of different types of mutations at each position in the genome by multinomial logistic regression. We analyse 505 cancer genomes from 14 different cancer types and compare how well region-based and site-specific models predict the mutation rate. We show that, for 1000 randomly selected genomic positions, the site-specific model predicts the mutation rate much better than the region-based models. We use a forward selection procedure to identify the most important explanatory variables. The procedure identifies site-specific conservation (phyloP), replication timing, and expression level as the best predictors of the mutation rate. Finally, our model confirms and quantifies certain well-known mutational signatures.
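As a rough sketch of what a per-position multinomial logistic model computes, the following evaluates softmax probabilities with "no mutation" as the reference class. The outcome classes, the covariate names (`phylop`, `repl_timing`, `expression`), and every coefficient are assumptions made for illustration; they are not the paper's parameterisation or its estimates.

```python
import math

def mutation_probs(features, coefs):
    """Multinomial logistic (softmax) probabilities with a reference class.

    features: covariate name -> value for one genomic position.
    coefs: non-reference class -> {"intercept": ..., covariate name: coefficient}.
    """
    scores = {"none": 0.0}  # reference class has linear predictor 0
    for cls, beta in coefs.items():
        scores[cls] = beta["intercept"] + sum(
            beta[name] * value for name, value in features.items()
        )
    top = max(scores.values())  # subtract the max for numerical stability
    exp_scores = {cls: math.exp(s - top) for cls, s in scores.items()}
    total = sum(exp_scores.values())
    return {cls: e / total for cls, e in exp_scores.items()}

# Hypothetical coefficients, NOT estimates from the paper.
coefs = {
    "transition": {"intercept": -12.0, "phylop": -0.2,
                   "repl_timing": 0.5, "expression": -0.1},
    "transversion": {"intercept": -13.0, "phylop": -0.2,
                     "repl_timing": 0.4, "expression": -0.1},
}

# Hypothetical covariate values at a single position.
site = {"phylop": 1.5, "repl_timing": 0.8, "expression": 2.0}

probs = mutation_probs(site, coefs)
for cls, p in probs.items():
    print(f"{cls}: {p:.3e}")
```

Since each position gets its own covariate vector, nothing has to be averaged over a region; fitting the coefficients (for example by maximum likelihood) and forward selection over candidate covariates would sit on top of this prediction step.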

CONCLUSION

We find that our site-specific multinomial regression model outperforms the region-based models. The possibility of including genomic variables on different scales and patient-specific variables makes it a versatile framework for studying different mutational mechanisms. Our model can serve as the neutral null model for the mutational process; regions that deviate from the null model are candidates for elements that drive cancer development.
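One simple way to picture "deviation from the null model" is to compare the observed mutation count in a region with the count expected from per-site null probabilities. The sketch below uses a Poisson approximation for the upper tail; that choice, the per-site probabilities, and the observed count are assumptions for illustration and should not be read as the authors' significance test. Only the cohort size of 505 genomes comes from the abstract.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

# Hypothetical null-model mutation probabilities for a 500-site region
# (per site, per sample).
site_probs = [2e-6] * 500
n_samples = 505  # cohort size stated in the abstract

# Expected number of mutations in the region across the cohort.
expected = n_samples * sum(site_probs)

observed = 6  # hypothetical observed count
p_value = poisson_upper_tail(observed, expected)

print(f"expected = {expected:.3f}, observed = {observed}, p = {p_value:.2e}")
```

A small p-value here only flags the region as a candidate; a full analysis would also have to handle differences between samples and correct for testing many regions.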

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e9b3/5909259/814344d02fa3/12859_2018_2141_Fig1_HTML.jpg

Similar articles

Functional and genetic determinants of mutation rate variability in regulatory elements of cancer genomes.

Genome Biol. 2021 May 3;22(1):133. doi: 10.1186/s13059-021-02318-x.

Identification of coding and non-coding mutational hotspots in cancer genomes.

BMC Genomics. 2017 Jan 5;18(1):17. doi: 10.1186/s12864-016-3420-9.

Cell-of-origin chromatin organization shapes the mutational landscape of cancer.

Nature. 2015 Feb 19;518(7539):360-364. doi: 10.1038/nature14221.

NIMBus: a negative binomial regression based Integrative Method for mutation Burden Analysis.

BMC Bioinformatics. 2020 Oct 22;21(1):474. doi: 10.1186/s12859-020-03758-1.

Mutalisk: a web-based somatic MUTation AnaLyIS toolKit for genomic, transcriptional and epigenomic signatures.

Nucleic Acids Res. 2018 Jul 2;46(W1):W102-W108. doi: 10.1093/nar/gky406.

A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures.

PLoS Genet. 2015 Dec 2;11(12):e1005657. doi: 10.1371/journal.pgen.1005657. eCollection 2015 Dec.

Genome-wide mutational spectra analysis reveals significant cancer-specific heterogeneity.

Sci Rep. 2015 Jul 27;5:12566. doi: 10.1038/srep12566.

Hotspot mutations delineating diverse mutational signatures and biological utilities across cancer types.

BMC Genomics. 2016 Jun 23;17 Suppl 2(Suppl 2):394. doi: 10.1186/s12864-016-2727-x.

Frequent mutations in acetylation and ubiquitination sites suggest novel driver mechanisms of cancer.

Genome Med. 2016 May 12;8(1):55. doi: 10.1186/s13073-016-0311-2.

Cited by

Sequence dependencies and mutation rates of localized mutational processes in cancer.

Genome Med. 2023 Aug 17;15(1):63. doi: 10.1186/s13073-023-01217-z.

A narrative review of prognosis prediction models for non-small cell lung cancer: what kind of predictors should be selected and how to improve models?

Ann Transl Med. 2021 Oct;9(20):1597. doi: 10.21037/atm-21-4733.

The landscape and driver potential of site-specific hotspots across cancer genomes.

NPJ Genom Med. 2021 May 13;6(1):33. doi: 10.1038/s41525-021-00197-6.

Untangling a complex web: Computational analyses of tumor molecular profiles to decode driver mechanisms.

J Genet Genomics. 2020 Oct 20;47(10):595-609. doi: 10.1016/j.jgg.2020.11.001. Epub 2020 Nov 28.

ncdDetect2: improved models of the site-specific mutation rate in cancer and driver detection with robust significance evaluation.

Bioinformatics. 2019 Jan 15;35(2):189-199. doi: 10.1093/bioinformatics/bty511.

Non-coding cancer driver candidates identified with a sample- and position-specific model of the somatic mutation rate.

Elife. 2017 Mar 31;6:e21778. doi: 10.7554/eLife.21778.
