Tissue-specific impact of stem-loops and quadruplexes on cancer breakpoints formation.

Affiliation

Faculty of Computer Science, National Research University Higher School of Economics, 125319, Moscow, 3 Kochnovsky Proezd, Russia.

Publication information

BMC Cancer. 2019 May 10;19(1):434. doi: 10.1186/s12885-019-5653-x.

PMID: 31077166
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6511154/
Abstract

BACKGROUND

Chromosomal rearrangements are typical of cancer genomes; they cause gene disruptions and fusions, corrupt regulatory elements, and damage chromosome integrity. Non-B DNA structures are among the factors contributing to genomic instability, with stem-loops and quadruplexes being the most prevalent. We aimed to investigate the impact of these two classes of non-B DNA structures on cancer breakpoint hotspots using a machine learning approach.

METHODS

Because the data considered are extremely imbalanced and a reliable estimate of predictive power was required, we developed a procedure for building and evaluating machine learning models. We built logistic regression models that predict cancer breakpoint hotspots from the densities of stem-loops and quadruplexes, jointly and separately. We also tested Random Forest models under different resampling schemes (leave-one-out cross-validation, train-test split, 3-fold cross-validation) and class balancing techniques (oversampling, stratification, synthetic minority oversampling).
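Two of the techniques named above, stratification and oversampling, can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the function names `stratified_folds` and `oversample_minority`, the toy labels, and the 6-in-100 hotspot rate are all assumptions made for illustration. The point it shows is that folds keep the class ratio, and that balancing is applied to the training part only so the held-out fold still reflects the real imbalance.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=3, seed=0):
    """Split sample indices into k folds that each keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets its share of each class.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def oversample_minority(train_idx, labels, seed=0):
    """Randomly duplicate minority-class indices until both classes are equal."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if labels[i] == 1]
    neg = [i for i in train_idx if labels[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("training fold contains only one class")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


# Hypothetical toy labels: 1 = genomic window is a breakpoint hotspot (rare).
labels = [1] * 6 + [0] * 94
folds = stratified_folds(labels, k=3, seed=42)

for held_out in range(3):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Balance only the training part; the test fold keeps the real imbalance.
    balanced = oversample_minority(train_idx, labels, seed=held_out)
    n_pos = sum(labels[i] for i in balanced)
    print(held_out, len(test_idx), n_pos, len(balanced) - n_pos)
```

Synthetic minority oversampling differs from the random duplication shown here in that it interpolates new minority samples in feature space rather than repeating existing ones.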

RESULTS

We analyzed 487,425 breakpoints from 2234 samples covering 10 cancer types available from the International Cancer Genome Consortium. We showed that the distributions of breakpoint hotspots in different cancer types are not correlated, confirming the heterogeneous nature of cancer. The stem-loop-based model best explains the blood, brain, liver, and prostate cancer breakpoint hotspot profiles, while the quadruplex-based model performs better for bone, breast, ovary, pancreatic, and skin cancer. For the overall cancer profile and for uterus cancer, the joint model shows the highest performance. For particular datasets the models reach high predictive power using just one predictor, and in the majority of cases a model built on both predictors does not improve performance.
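The per-tissue comparison of single-predictor models can be sketched as follows. The abstract does not state which performance measure was used, so ROC AUC is an assumption here, as are the helper names `roc_auc` and `best_predictor` and the toy densities. For a one-predictor logistic model with a positive coefficient, the fitted probability is a monotone function of the density, so ranking windows by raw density gives the same AUC as ranking by model output; the sketch relies on that shortcut instead of fitting a model.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def best_predictor(densities, labels):
    """Return the predictor name with the highest AUC, plus all AUCs."""
    aucs = {name: roc_auc(values, labels) for name, values in densities.items()}
    return max(aucs, key=aucs.get), aucs


# Hypothetical toy data for one tissue: six genomic windows, two hotspots.
hotspot = [1, 1, 0, 0, 0, 0]
densities = {
    "stem_loops": [0.9, 0.7, 0.8, 0.2, 0.1, 0.3],
    "quadruplexes": [0.4, 0.1, 0.5, 0.3, 0.2, 0.6],
}

winner, aucs = best_predictor(densities, hotspot)
print(winner, aucs)  # stem_loops {'stem_loops': 0.875, 'quadruplexes': 0.25}
```

Running the same comparison separately for each cancer type would give the kind of per-tissue ranking summarized above; a joint model would need an actual fit on both predictors.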

CONCLUSION

Despite the heterogeneity in the distribution of breakpoint hotspots across cancer types, our results demonstrate an association between cancer breakpoint hotspots and both stem-loops and quadruplexes. For approximately half of the cancer types stem-loops are the most influential factor, while for the others it is quadruplexes. This reflects differences in the regulatory potential of stem-loops and quadruplexes at the tissue-specific level, which have yet to be discovered at the genome-wide scale. The analysis demonstrates that the influence of stem-loops and quadruplexes on breakpoint hotspot formation is tissue-specific.
