DeepCAPE: A Deep Convolutional Neural Network for the Accurate Prediction of Enhancers.

Authors

Chen Shengquan, Gan Mingxin, Lv Hairong, Jiang Rui

Affiliations

Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China.

Department of Management Science and Engineering, School of Economics and Management, University of Science and Technology Beijing, Beijing 100083, China.

Publication information

Genomics Proteomics Bioinformatics. 2021 Aug;19(4):565-577. doi: 10.1016/j.gpb.2019.04.006. Epub 2021 Feb 11.

DOI: 10.1016/j.gpb.2019.04.006
PMID: 33581335
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9040020/
Abstract

Establishing a landscape of enhancers across human cells is crucial to deciphering the mechanisms of gene regulation, cell differentiation, and disease development. High-throughput experimental approaches, which have successfully reported enhancers in typical cell lines, are still too costly and time-consuming for the systematic identification of enhancers specific to different cell lines. Existing computational methods, which predict regulatory elements purely from DNA sequences, lack the power of cell line-specific screening. Recent studies have suggested that the chromatin accessibility of a DNA segment is closely related to its potential function in regulation, and thus may provide useful information for identifying regulatory elements. Motivated by this understanding, we integrate DNA sequences and chromatin accessibility data to accurately predict enhancers in a cell line-specific manner. We propose DeepCAPE, a deep convolutional neural network that predicts enhancers via the integration of DNA sequences and DNase-seq data. Benefiting from a well-designed feature extraction mechanism and a skip connection strategy, our model not only consistently outperforms existing methods in the imbalanced classification of cell line-specific enhancers against background sequences, but also self-adapts to datasets of different sizes. In addition, with the adoption of an auto-encoder, our model is capable of making cross-cell line predictions. We further visualize the kernels of the first convolutional layer and show that the identified sequence signatures match known motifs. Finally, we demonstrate the potential of our model to explain the functional implications of putative disease-associated genetic variants and to discriminate disease-related enhancers. The source code and a detailed tutorial for DeepCAPE are freely available at https://github.com/ShengquanChen/DeepCAPE.
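To make the idea of integrating DNA sequence with chromatin accessibility more concrete, the following is a minimal sketch in plain Python. It is not taken from the DeepCAPE code base: the function names (`one_hot`, `combine`, `scan`), the choice to stack the accessibility track as a fifth input channel, and the toy sequence, signal and kernel values are all assumptions made for illustration. The sketch one-hot encodes a sequence, appends a per-base accessibility signal, and slides a single first-layer-style kernel across the result, which is roughly the operation a convolutional layer performs when it learns motif-like signatures.

```python
from typing import List, Sequence

BASES = "ACGT"  # row order of the one-hot matrix (illustrative convention)


def one_hot(seq: str) -> List[List[float]]:
    """Return a 4 x len(seq) matrix; bases outside ACGT (e.g. N) become all-zero columns."""
    seq = seq.upper()
    return [[1.0 if base == b else 0.0 for base in seq] for b in BASES]


def combine(seq: str, accessibility: Sequence[float]) -> List[List[float]]:
    """Stack the one-hot sequence with a per-base accessibility track (5 x L matrix).

    Treating accessibility as a fifth channel is an assumption of this sketch,
    not necessarily how DeepCAPE combines the two data types.
    """
    if len(seq) != len(accessibility):
        raise ValueError("sequence and accessibility track must have equal length")
    return one_hot(seq) + [[float(v) for v in accessibility]]


def scan(matrix: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide a C x K kernel along a C x L input (valid cross-correlation, stride 1)."""
    channels, length, width = len(matrix), len(matrix[0]), len(kernel[0])
    return [
        sum(matrix[c][i + k] * kernel[c][k]
            for c in range(channels) for k in range(width))
        for i in range(length - width + 1)
    ]


# Toy input: a 5-bp sequence with a made-up accessibility signal.
x = combine("ACGTN", [0.0, 0.5, 1.0, 0.5, 0.0])

# Hypothetical kernel that responds to the dinucleotide "CG" in accessible chromatin.
kernel = [
    [0.0, 0.0],  # A
    [1.0, 0.0],  # C at the first kernel position
    [0.0, 1.0],  # G at the second kernel position
    [0.0, 0.0],  # T
    [0.5, 0.5],  # accessibility channel
]

scores = scan(x, kernel)
print(scores)  # [0.25, 2.75, 0.75, 0.25]
```

The highest score falls at the position where "CG" coincides with the strongest accessibility signal. A trained network would learn many such kernels from data rather than have them set by hand, and the learned kernels are what can later be compared with known motifs.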
