DeepG4: A deep learning approach to predict cell-type specific active G-quadruplex regions.

Affiliations

Molecular, Cellular and Developmental biology department (MCD), Centre de Biologie Intégrative (CBI), University of Toulouse, CNRS, UPS, Toulouse, France.

Centre de Recherches en Cancérologie de Toulouse (CRCT), INSERM U1037, Toulouse, France.

Publication information

PLoS Comput Biol. 2021 Aug 12;17(8):e1009308. doi: 10.1371/journal.pcbi.1009308. eCollection 2021 Aug.

DOI: 10.1371/journal.pcbi.1009308
PMID: 34383754
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8384162/
Abstract

DNA is a complex molecule carrying the instructions an organism needs to develop, live and reproduce. In 1953, Watson and Crick discovered that DNA is composed of two chains forming a double helix. Later, other DNA structures were discovered and shown to play important roles in the cell, in particular the G-quadruplex (G4). Following genome sequencing, several bioinformatic algorithms were developed to map G4s in vitro based on a canonical sequence motif, G-richness and G-skewness, or alternatively on sequence features including k-mers, and more recently on machine/deep learning. Recently, new sequencing techniques were developed to map G4s in vitro (G4-seq) and in vivo (G4 ChIP-seq) at a resolution of a few hundred bases. Here, we propose a novel convolutional neural network (DeepG4) to map cell-type specific active G4 regions (i.e. regions within which G4s form both in vitro and in vivo). DeepG4 predicts active G4 regions in different cell types with high accuracy. Moreover, DeepG4 identifies key DNA motifs that are predictive of G4 region activity. We found that such motifs do not follow the very flexible sequence pattern that current algorithms search for. Instead, active G4 regions are determined by numerous specific motifs. Among those motifs, we identified known transcription factors (TFs) that could play important roles in G4 activity, either directly by contributing to the G4 structures themselves or indirectly by participating in G4 formation in the vicinity. In addition, we used DeepG4 to predict active G4 regions in a large number of tissues and cancers, thereby providing a comprehensive resource for researchers. Availability: https://github.com/morphos30/DeepG4.
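The sequence-based criteria named in the abstract (a canonical motif, G-richness and G-skewness) can be illustrated with a short Python sketch. It is not taken from DeepG4 or from any of the algorithms the abstract alludes to. The regular expression is the widely used canonical pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The richness and skewness formulas are common definitions assumed here for illustration, and the function names are hypothetical.

```python
import re

# Canonical G4 motif: four G-runs (>= 3 G each) separated by loops of 1-7 nt.
# This pattern is an assumption for illustration, not DeepG4's model.
CANONICAL_G4 = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_canonical_g4(seq):
    """Return (start, end, matched_sequence) for each non-overlapping hit.

    Only the given strand is scanned; a C-rich hit on the opposite strand
    would require scanning the reverse complement as well.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in CANONICAL_G4.finditer(seq)]


def g_richness(seq):
    """Fraction of guanines in the sequence (assumed definition)."""
    seq = seq.upper()
    return seq.count("G") / len(seq) if seq else 0.0


def g_skewness(seq):
    """(G - C) / (G + C), an assumed definition; 0.0 when neither base occurs."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


if __name__ == "__main__":
    # Four human telomeric repeats, a classic G4-forming sequence.
    telomere = "TTAGGGTTAGGGTTAGGGTTAGGGTT"
    print(find_canonical_g4(telomere))
    print(g_richness(telomere), g_skewness(telomere))
```

A fixed pattern of this kind is the "flexible sequence pattern" approach that the abstract contrasts with the numerous specific motifs learned by the convolutional network.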

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b2/8384162/00b967913638/pcbi.1009308.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b2/8384162/21f19d39fc28/pcbi.1009308.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b2/8384162/f22853cc4e78/pcbi.1009308.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b2/8384162/80529e2e4e7b/pcbi.1009308.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b2/8384162/ce60b89a2e4f/pcbi.1009308.g005.jpg
