Assessing comparative importance of DNA sequence and epigenetic modifications on gene expression using a deep convolutional neural network.

Authors

Gao Shang, Rehman Jalees, Dai Yang

Affiliations

Department of Biomedical Engineering, University of Illinois at Chicago, Chicago, IL, USA.

Department of Medicine, Division of Cardiology, University of Illinois at Chicago, Chicago, IL, USA.

Publication information

Comput Struct Biotechnol J. 2022 Jul 13;20:3814-3823. doi: 10.1016/j.csbj.2022.07.014. eCollection 2022.

DOI: 10.1016/j.csbj.2022.07.014
PMID: 35891778
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9307602/
Abstract

Gene expression is regulated at both the transcriptional and post-transcriptional levels. DNA sequence and epigenetic modifications are key factors that regulate gene transcription. Understanding their complex interactions and their respective contributions to gene expression regulation remains a challenge in biological studies. We have developed iSEGnet, a deep convolutional neural network framework that predicts mRNA abundance from DNA sequences and epigenetic modifications within genes and their cis-regulatory regions. We demonstrate that our framework outperforms other machine learning models in predicting mRNA abundance, using transcriptional and epigenetic profiles from six distinct cell lines/types chosen from ENCODE. Analysis of the learned models also reveals that specific regions around promoters and transcription termination sites are most important for gene expression regulation. Using the Integrated Gradients method, we identify narrow segments in these regions that are most likely to impact gene expression for a specific epigenetic modification. We further show that these identified segments are enriched in known active regulatory regions by comparing them with transcription factor binding sites obtained via ChIP-seq. Moreover, we demonstrate how iSEGnet can uncover potential transcription factors that have regulatory functions in cancer using two cancer multi-omics datasets.
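
The Integrated Gradients step mentioned in the abstract can be illustrated with a minimal sketch. This is not the iSEGnet implementation: the logistic `toy_model`, its `weights`, the per-base `signal` values, and the all-zero baseline are all assumptions chosen for illustration, standing in for a trained network and real epigenetic tracks. The sketch only shows how an attribution per input feature is obtained by averaging gradients along a straight path from a baseline to the input.

```python
import math

def one_hot(seq):
    """Encode a DNA string as a flat list: 4 values (A, C, G, T) per base."""
    alphabet = "ACGT"
    encoded = []
    for base in seq:
        encoded.extend(1.0 if base == letter else 0.0 for letter in alphabet)
    return encoded

def toy_model(x, weights):
    """Hypothetical stand-in for a trained network: a single logistic unit."""
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def toy_gradient(x, weights):
    """Analytic gradient of toy_model with respect to each input feature."""
    y = toy_model(x, weights)
    return [y * (1.0 - y) * w for w in weights]

def integrated_gradients(x, baseline, weights, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        grad = toy_gradient(point, weights)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the average gradient by the input-minus-baseline difference
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, total)]

# Input: one-hot sequence features followed by one made-up signal value per base
seq = "ACGT"
signal = [0.2, 0.9, 0.1, 0.5]
x = one_hot(seq) + signal
baseline = [0.0] * len(x)  # assumed all-zero reference input
weights = [0.3, -0.2, 0.5, 0.1] * 4 + [1.0, -0.5, 0.8, 0.4]  # illustrative only

attributions = integrated_gradients(x, baseline, weights)
# Completeness check: attributions should sum to f(x) - f(baseline)
gap = abs(sum(attributions) - (toy_model(x, weights) - toy_model(baseline, weights)))
```

Features that equal the baseline receive zero attribution, and `gap` shrinks as `steps` grows. With a real network the analytic gradient would be replaced by automatic differentiation, and positions with large absolute attribution would be the candidate "narrow segments".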

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/34eb6e066d1b/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/463a206b1e19/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/a9b2e6cc559b/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/1604c718582a/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/fc810aa11d3e/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/be154c32e423/gr6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7619/9307602/27942030fd84/gr7.jpg