Accurate and highly interpretable prediction of gene expression from histone modifications.

Affiliations

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy.

Department of Computing, Imperial College London, London, UK.

Publication information

BMC Bioinformatics. 2022 Apr 26;23(1):151. doi: 10.1186/s12859-022-04687-x.

DOI: 10.1186/s12859-022-04687-x
PMID: 35473556
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9040271/
Abstract

BACKGROUND

Histone Mark Modifications (HMs) are crucial actors in gene regulation, as they actively remodel chromatin to modulate transcriptional activity: aberrant combinatorial patterns of HMs have been connected with several diseases, including cancer. HMs are, however, reversible modifications: understanding their role in disease would allow the design of 'epigenetic drugs' for specific, non-invasive treatments. Standard statistical techniques were not entirely successful in extracting representative features from raw HM signals over gene locations. On the other hand, deep learning approaches allow for effective automatic feature extraction, but at the expense of model interpretation.

RESULTS

Here, we propose ShallowChrome, a novel computational pipeline to model transcriptional regulation via HMs in both an accurate and interpretable way. We attain state-of-the-art results on the binary classification of gene transcriptional states over 56 cell-types from the REMC database, largely outperforming recent deep learning approaches. We interpret our models by extracting insightful gene-specific regulative patterns, and we analyse them for the specific case of the PAX5 gene over three differentiated blood cell lines. Finally, we compare the patterns we obtained with the characteristic emission patterns of ChromHMM, and show that ShallowChrome is able to coherently rank groups of chromatin states w.r.t. their transcriptional activity.
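As a rough illustration of the kind of model described here, the sketch below fits a binary classifier on per-gene histone-mark features and then reads the learned weights as a per-mark ranking. It is not the ShallowChrome pipeline. Logistic regression is the model family named in the conclusions of this abstract, but the data here are synthetic, and the five mark names, the weights in `true_w`, and every function name are assumptions made only for the example.

```python
import numpy as np

# Hypothetical feature order: one column per histone mark (assumed, not from the abstract).
HM_NAMES = ["H3K4me3", "H3K4me1", "H3K36me3", "H3K9me3", "H3K27me3"]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_synthetic_genes(n_genes=2000, seed=0):
    """Draw synthetic per-gene features and binary labels (1 = expressed)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_genes, len(HM_NAMES)))
    # Illustrative weights only: positive for marks made up to be activating here.
    true_w = np.array([2.0, 0.5, 1.0, -0.5, -1.5])
    y = (rng.random(n_genes) < sigmoid(X @ true_w)).astype(float)
    return X, y


def fit_logistic_regression(X, y, lr=0.1, n_iter=2000, l2=1e-3):
    """Plain batch gradient descent on the L2-regularised logistic loss."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / n + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


X, y = make_synthetic_genes()
w, b = fit_logistic_regression(X, y)

# Training accuracy at a 0.5 probability threshold.
accuracy = np.mean((sigmoid(X @ w + b) >= 0.5) == y)

# Interpretation step: rank marks by the magnitude of their learned weight.
ranked = sorted(zip(HM_NAMES, w), key=lambda item: -abs(item[1]))
for name, weight in ranked:
    print(f"{name:>9s}  weight = {weight:+.2f}")
print(f"training accuracy = {accuracy:.3f}")
```

Because the model is linear in the features, the sign and size of each weight can be read directly as the direction and strength of that mark's association with the expressed state, which is the sort of inspection a deep network does not offer without extra tooling.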

CONCLUSIONS

In this work we demonstrate that it is possible to model HM-modulated gene expression regulation in a highly accurate, yet interpretable way. Our feature extraction algorithm leverages on data downstream the identification of enriched regions to retrieve gene-wise, statistically significant and dynamically located features for each HM. These features are highly predictive of gene transcriptional state, and allow for accurate modeling by computationally efficient logistic regression models. These models allow a direct inspection and a rigorous interpretation, helping to formulate quantifiable hypotheses.
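One possible reading of "gene-wise, statistically significant and dynamically located features" is sketched below: for each gene and each HM, keep the most significant called peak that overlaps a window around the transcription start site, so the feature's position varies from gene to gene. This is an assumption-laden illustration rather than the authors' algorithm. The `Peak` type, the `flank` window of 5,000 bp, the use of a peak-caller significance score, and the fallback value of 0.0 are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Peak:
    start: int    # inclusive genomic coordinate
    end: int      # exclusive genomic coordinate
    score: float  # significance reported by a peak caller (assumed, e.g. -log10 p)


def peak_feature(peaks: List[Peak], tss: int, flank: int = 5000) -> Tuple[float, Optional[Peak]]:
    """Return the best score among peaks overlapping [tss - flank, tss + flank), plus that peak."""
    lo, hi = tss - flank, tss + flank
    best: Optional[Peak] = None
    for p in peaks:
        overlaps = p.start < hi and p.end > lo
        if overlaps and (best is None or p.score > best.score):
            best = p
    if best is None:
        return 0.0, None  # no enriched region near this gene
    return best.score, best


def gene_feature_vector(peaks_by_hm: Dict[str, List[Peak]], tss: int,
                        hm_order: List[str], flank: int = 5000) -> List[float]:
    """One feature per HM, in a fixed order, for a single gene."""
    return [peak_feature(peaks_by_hm.get(hm, []), tss, flank)[0] for hm in hm_order]


# Toy input: made-up peaks for one gene whose TSS sits at position 10,000.
peaks_by_hm = {
    "H3K4me3": [Peak(9_800, 10_400, 35.2), Peak(40_000, 40_500, 12.0)],
    "H3K27me3": [Peak(2_000, 4_500, 6.1)],
}
features = gene_feature_vector(
    peaks_by_hm, tss=10_000, hm_order=["H3K4me3", "H3K27me3", "H3K36me3"]
)
print(features)  # [35.2, 0.0, 0.0]
```

A vector like `features`, computed for every gene, is the sort of compact input a logistic regression model could consume. The selected peak returned by `peak_feature` records where the signal was found for that gene.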

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32d1/9040271/f9abd9294672/12859_2022_4687_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32d1/9040271/6fb0ee3ad918/12859_2022_4687_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32d1/9040271/bd4a2ab6425a/12859_2022_4687_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32d1/9040271/c6c380669f5a/12859_2022_4687_Fig4_HTML.jpg

Similar articles

1
Accurate and highly interpretable prediction of gene expression from histone modifications.
BMC Bioinformatics. 2022 Apr 26;23(1):151. doi: 10.1186/s12859-022-04687-x.
2
Toward breaking the histone code: bayesian graphical models for histone modifications.
Circ Cardiovasc Genet. 2013 Aug;6(4):419-26. doi: 10.1161/CIRCGENETICS.113.000100. Epub 2013 Jun 7.
3
DeepDiff: DEEP-learning for predicting DIFFerential gene expression from histone modifications.
Bioinformatics. 2018 Sep 1;34(17):i891-i900. doi: 10.1093/bioinformatics/bty612.
4
dHICA: a deep transformer-based model enables accurate histone imputation from chromatin accessibility.
Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae459.
5
Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):54. doi: 10.1186/s12918-016-0302-3.
6
Combinatorial modeling of chromatin features quantitatively predicts DNA replication timing in Drosophila.
PLoS Comput Biol. 2014 Jan;10(1):e1003419. doi: 10.1371/journal.pcbi.1003419. Epub 2014 Jan 23.
7
An integrative analysis of post-translational histone modifications in the marine diatom Phaeodactylum tricornutum.
Genome Biol. 2015 May 20;16(1):102. doi: 10.1186/s13059-015-0671-8.
8
DeepChrome: deep-learning for predicting gene expression from histone modifications.
Bioinformatics. 2016 Sep 1;32(17):i639-i648. doi: 10.1093/bioinformatics/btw427.
9
Histone modifications involved in cassette exon inclusions: a quantitative and interpretable analysis.
BMC Genomics. 2014 Dec 19;15(1):1148. doi: 10.1186/1471-2164-15-1148.
10
The histone code of Toxoplasma gondii comprises conserved and unique posttranslational modifications.
mBio. 2013 Dec 10;4(6):e00922-13. doi: 10.1128/mBio.00922-13.

Cited by

1
Towards universal modeling of transcript isoform expression levels.
bioRxiv. 2025 Jul 25:2025.07.21.665977. doi: 10.1101/2025.07.21.665977.
2
Prediction of gene expression using histone modification patterns extracted by Particle Swarm Optimization.
Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf033.
3
Predicting the effect of CRISPR-Cas9-based epigenome editing.
