Characterizing Promoter and Enhancer Sequences by a Deep Learning Method.

Authors

Zeng Xin, Park Sung-Joon, Nakai Kenta

Affiliations

Department of Computational Biology and Medical Science, The University of Tokyo, Kashiwa, Japan.

Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan.

Publication information

Front Genet. 2021 Jun 15;12:681259. doi: 10.3389/fgene.2021.681259. eCollection 2021.

DOI:10.3389/fgene.2021.681259
PMID:34211503
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8239401/
Abstract

Promoters and enhancers are well-known regulatory elements modulating gene expression. As confirmed by high-throughput sequencing technologies, these regulatory elements are bidirectionally transcribed. That is, promoters produce stable mRNA in the sense direction and unstable RNA in the antisense direction, while enhancers transcribe unstable RNA in both directions. Although it is thought that enhancers and promoters share a similar architecture of transcription start sites (TSSs), how the transcriptional machinery distinctly uses these genomic regions as promoters or enhancers remains unclear. To address this issue, we developed a deep learning (DL) method by utilizing a convolutional neural network (CNN) and the saliency algorithm. In comparison with other classifiers, our CNN presented higher predictive performance, suggesting the overarching importance of the high-order sequence features, captured by the CNN. Moreover, our method revealed that there are substantial sequence differences between the enhancers and promoters. Remarkably, the 20-120 bp downstream regions from the center of bidirectional TSSs seemed to contribute to the RNA stability. These regions in promoters tend to have a larger number of guanines and cytosines compared to those in enhancers, and this feature contributed to the classification of the regulatory elements. Our CNN-based method can capture the complex TSS architectures. We found that the genomic regions around TSSs for promoters and enhancers contribute to RNA stability and show GC-biased characteristics as a critical determinant for promoter TSSs.
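The GC-bias observation above can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the forward-strand coordinate convention (window taken as center+20 up to center+120), and the two toy sequences are all assumptions made for illustration, and the sequences are synthetic rather than data from the paper.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def downstream_window(seq: str, center: int, start: int = 20, end: int = 120) -> str:
    """Bases from center+start up to (not including) center+end on the forward strand.

    The 20-120 bp range follows the abstract; the exact indexing convention
    is an assumption of this sketch.
    """
    lo = max(center + start, 0)
    hi = min(center + end, len(seq))
    return seq[lo:hi]


# Synthetic toy sequences (hypothetical, not from the paper).
# Layout: 50 bp upstream, 20 bp spacer, 100 bp window, 30 bp tail.
center = 50  # assumed index of the bidirectional TSS center
promoter_like = "AT" * 25 + "AT" * 10 + "GC" * 50 + "AT" * 15
enhancer_like = "AT" * 25 + "AT" * 10 + "ATGA" * 25 + "AT" * 15

promoter_gc = gc_fraction(downstream_window(promoter_like, center))
enhancer_gc = gc_fraction(downstream_window(enhancer_like, center))

print(f"promoter-like window GC: {promoter_gc:.2f}")  # prints 1.00
print(f"enhancer-like window GC: {enhancer_gc:.2f}")  # prints 0.25
```

On real data, the same window statistic would be computed per TSS, with reverse-strand sequences reverse-complemented first, and compared as distributions between the two element classes. A single GC fraction is only a summary feature and does not reproduce the CNN or the saliency analysis described in the abstract.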

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84c5/8239401/d2583789c3c7/fgene-12-681259-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84c5/8239401/6c6b47d17f6a/fgene-12-681259-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84c5/8239401/1923b5915c06/fgene-12-681259-g002.jpg
