Comprehensive analysis of cancer breakpoints reveals signatures of genetic and epigenetic contribution to cancer genome rearrangements.

Affiliations

Laboratory of Bioinformatics, Faculty of Computer Science, National Research University Higher School of Economics, Moscow, Russia.

Faculty of Digital Transformation, ITMO University, St. Petersburg, Russia.

Publication information

PLoS Comput Biol. 2021 Mar 1;17(3):e1008749. doi: 10.1371/journal.pcbi.1008749. eCollection 2021 Mar.

DOI:10.1371/journal.pcbi.1008749
PMID:33647036
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7951985/
Abstract

Understanding the mechanisms of cancer breakpoint mutagenesis is a difficult task, and predictive models of cancer breakpoint formation have so far failed to achieve even moderate predictive power. Here we take advantage of a machine learning approach that can gather important features from big data and quantify the contribution of different factors. We performed a comprehensive analysis of almost 630,000 cancer breakpoints and quantified the contribution of genomic and epigenomic features: non-B DNA structures, chromatin organization, transcription factor binding sites and epigenetic markers. The results showed that transcription and formation of non-B DNA structures are the two major processes responsible for cancer genome fragility. Epigenetic factors, such as chromatin organization in TADs, open/closed regions, DNA methylation and histone marks, are less informative but do make a contribution. As a general trend, among individual features inside the groups, G-quadruplexes and repeats and the transcription factors CTCF, GABPA, RXRA, SP1, MAX and NR2F2 show a relatively high contribution. Overall, the cancer breakpoint landscape can be represented by well-predicted hotspots and poorly predicted individual breakpoints scattered across genomes. We demonstrated that hotspot mutagenesis has genomic and epigenomic factors, and that not all individual cancer breakpoints are random noise; some have a definite mutation signature. In addition, we found a long-range action of some features on breakpoint mutagenesis. By combining omics data and cancer-specific individual feature importance, and by adding distant features to local ones, predictive models for cancer breakpoint formation achieved 70-90% ROC AUC for different cancer types; however, precision remained low at 2% and recall did not exceed 50%. On the one hand, the power of the models strongly correlates with the size of the available cancer breakpoint and epigenomic data; on the other hand, finding strong determinants of cancer breakpoint formation remains a challenge. The strength of the predictive signals of each group and of each feature inside a group can be converted into cancer-specific breakpoint mutation signatures. Overall, our results add to the understanding of cancer genome rearrangement processes.
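
The combination of a 70-90% ROC AUC with 2% precision can look contradictory, but it is what heavy class imbalance tends to produce. The sketch below is a worked arithmetic illustration only, not the authors' method. The prevalence of 1 breakpoint-containing window per 1,000 is a hypothetical value chosen for illustration, and the function names are invented here; only the 50% recall and 2% precision come from the abstract.

```python
def precision_from_rates(recall, false_positive_rate, prevalence):
    """Precision implied by recall (TPR), FPR and positive-class prevalence."""
    true_pos = recall * prevalence                         # fraction of all windows
    false_pos = false_positive_rate * (1.0 - prevalence)   # fraction of all windows
    return true_pos / (true_pos + false_pos)


def fpr_for_target_precision(recall, precision, prevalence):
    """False positive rate at which the given recall yields the given precision."""
    true_pos = recall * prevalence
    false_pos = true_pos * (1.0 - precision) / precision
    return false_pos / (1.0 - prevalence)


# ASSUMPTION (not from the paper): 1 in 1,000 genomic windows holds a breakpoint.
PREVALENCE = 0.001

# Recall and precision as reported in the abstract.
fpr = fpr_for_target_precision(recall=0.5, precision=0.02, prevalence=PREVALENCE)
print(f"False positive rate at this operating point: {fpr:.4f}")
print(f"Implied precision: {precision_from_rates(0.5, fpr, PREVALENCE):.4f}")
```

Under this assumed prevalence, the classifier flags only a few percent of negative windows while still recovering half of the positives, which fits a good ROC curve. Because negatives outnumber positives by about a thousand to one, those few false alarms still swamp the true hits, so precision stays low.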


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/bb2a41bc41da/pcbi.1008749.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/7dfd5aeb751c/pcbi.1008749.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/bb25d2c280de/pcbi.1008749.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/5096350cd87e/pcbi.1008749.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/d45c26604028/pcbi.1008749.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3634/7951985/db673b9601ef/pcbi.1008749.g006.jpg
