DeepCAGE: Incorporating Transcription Factors in Genome-wide Prediction of Chromatin Accessibility.

Affiliations

Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China; Department of Statistics, Stanford University, Stanford, CA 94305, USA.

Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China.

Publication information

Genomics Proteomics Bioinformatics. 2022 Jun;20(3):496-507. doi: 10.1016/j.gpb.2021.08.015. Epub 2022 Mar 12.

DOI:10.1016/j.gpb.2021.08.015
PMID:35293310
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9801045/
Abstract

Although computational approaches have been complementing high-throughput biological experiments for the identification of functional regions in the human genome, it remains a great challenge to systematically decipher interactions between transcription factors (TFs) and regulatory elements to achieve interpretable annotations of chromatin accessibility across diverse cellular contexts. To solve this problem, we propose DeepCAGE, a deep learning framework that integrates sequence information and binding statuses of TFs, for the accurate prediction of chromatin accessible regions at a genome-wide scale in a variety of cell types. DeepCAGE takes advantage of a densely connected deep convolutional neural network architecture to automatically learn sequence signatures of known chromatin accessible regions and then incorporates such features with expression levels and binding activities of human core TFs to predict novel chromatin accessible regions. In a series of systematic comparisons with existing methods, DeepCAGE exhibits superior performance in not only the classification but also the regression of chromatin accessibility signals. In a detailed analysis of TF activities, DeepCAGE successfully extracts novel binding motifs and measures the contribution of a TF to the regulation with respect to a specific locus in a certain cell type. When applied to whole-genome sequencing data analysis, our method successfully prioritizes putative deleterious variants underlying a human complex trait and thus provides insights into the understanding of disease-associated genetic variants. DeepCAGE can be downloaded from https://github.com/kimmo1019/DeepCAGE.
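
The abstract describes a model that combines sequence features with the expression levels and binding activities of TFs. The following is a minimal sketch of how such a two-part input could be assembled, not the authors' implementation (which is available from the GitHub repository above). The function names, the log2(x + 1) transform, the zero default for a missing motif score, and the example TF values are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of [A, C, G, T] indicators.

    Bases outside ACGT (for example N) become an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def tf_features(expression, motif_scores):
    """Build a flat TF feature vector for one locus in one cell type.

    For each TF (in sorted name order) emit two values:
    log2(expression + 1) and the motif score at the locus.
    Both choices are illustrative assumptions, not taken from the paper.
    """
    feats = []
    for tf in sorted(expression):
        feats.append(math.log2(expression[tf] + 1.0))
        feats.append(motif_scores.get(tf, 0.0))
    return feats


def build_input(seq, expression, motif_scores):
    """Bundle the sequence encoding and TF features as one model input."""
    return {
        "sequence": one_hot(seq),
        "tf": tf_features(expression, motif_scores),
    }


# Hypothetical values, chosen only to show the shapes involved.
example = build_input(
    "ACGN",
    expression={"CTCF": 3.0, "GATA1": 0.0},
    motif_scores={"CTCF": 0.8},
)
```

In this sketch the sequence part is cell-type independent, while the TF part changes with the cell type, which is what would let a single model make predictions across cellular contexts.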

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/6ea303883ee2/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/ea437b59b75d/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/616ef5f72d40/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/c6e8def3896f/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/552f65d2354a/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/808c/9801045/3205ba3fdd0d/gr6.jpg
