SEAMoD: A fully interpretable neural network for cis-regulatory analysis of differentially expressed genes.

Authors

Bhogale Shounak, Seward Chris, Stubbs Lisa, Sinha Saurabh

Affiliations

Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.

Pacific Northwest Research Institute, Seattle, WA 98122.

Publication information

bioRxiv. 2023 Nov 13:2023.11.09.565900. doi: 10.1101/2023.11.09.565900.

DOI:10.1101/2023.11.09.565900
PMID:38014229
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10680628/
Abstract

A common way to investigate gene regulatory mechanisms is to identify differentially expressed genes using transcriptomics, find their candidate enhancers using epigenomics, and search for over-represented transcription factor (TF) motifs in these enhancers using bioinformatics tools. A related follow-up task is to model gene expression as a function of enhancer sequences and rank TF motifs by their contribution to such models, thus prioritizing among regulators. We present a new computational tool called SEAMoD that performs the above tasks of motif finding and sequence-to-expression modeling simultaneously. It trains a convolutional neural network model to relate enhancer sequences to differential expression in one or more biological conditions. The model uses TF motifs to interpret the sequences, learning these motifs and their relative importance to each biological condition from data. It also utilizes epigenomic information in the form of activity scores of putative enhancers and automatically searches for the most promising enhancer for each gene. Compared to existing neural network models of non-coding sequences, SEAMoD uses far fewer parameters, requires far less training data, and emphasizes biological interpretability. We used SEAMoD to understand regulatory mechanisms underlying the differentiation of neural stem cells (NSCs) derived from mouse forebrain. We profiled gene expression and histone modifications in NSCs and three differentiated cell types and used SEAMoD to model differential expression of nearly 12,000 genes with an accuracy of 81%, in the process identifying Olig2, the E2f family TFs, Foxo3, and Tcf4 as key transcriptional regulators of the differentiation process.
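
The modeling idea described in the abstract (motif filters scanned along enhancer sequences, enhancer activity scores, and selection of one enhancer per gene) can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice to multiply motif evidence by the activity score, the selection of the enhancer by highest score, and the sigmoid output are all hypothetical, and the motif below is a made-up 3-bp pattern rather than a real TF motif. No training step is shown.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def motif_scan(encoded, motif):
    """Slide a motif (rows of 4 weights) along the sequence and return the best window score.

    This is a 1D convolution followed by max pooling over positions.
    Sequences shorter than the motif score 0.0 (an arbitrary choice for this sketch).
    """
    width = len(motif)
    if len(encoded) < width:
        return 0.0
    best = -math.inf
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * motif[i][j]
            for i in range(width)
            for j in range(4)
        )
        best = max(best, score)
    return best

def predict_gene(enhancers, motifs, motif_weights, bias=0.0):
    """Score every candidate enhancer of one gene and keep the highest-scoring one.

    enhancers: list of (sequence, activity_score) pairs (hypothetical input layout).
    Returns (probability that the gene is differentially expressed, index of chosen enhancer).
    """
    best_logit, best_idx = -math.inf, -1
    for idx, (seq, activity) in enumerate(enhancers):
        encoded = one_hot(seq)
        features = [motif_scan(encoded, m) for m in motifs]
        # Assumption: the activity score scales the weighted motif evidence.
        logit = bias + activity * sum(w * f for w, f in zip(motif_weights, features))
        if logit > best_logit:
            best_logit, best_idx = logit, idx
    return 1.0 / (1.0 + math.exp(-best_logit)), best_idx

# Toy motif that matches "ACG" exactly (one weight row per position).
toy_motif = [
    [1.0, 0.0, 0.0, 0.0],  # A
    [0.0, 1.0, 0.0, 0.0],  # C
    [0.0, 0.0, 1.0, 0.0],  # G
]

# Two candidate enhancers for one gene: (sequence, activity score).
candidates = [("TTTTTT", 0.9), ("TTACGT", 0.5)]

probability, chosen = predict_gene(candidates, [toy_motif], motif_weights=[2.0], bias=-1.0)
print(f"chosen enhancer: {chosen}, probability: {probability:.3f}")
```

In this toy case the second enhancer is chosen even though its activity score is lower, because it is the only one containing the motif. The per-motif weights are what make such a model readable: each weight states how much one motif contributes to the prediction for a given condition.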

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d24/10680628/9285278c4c6b/nihpp-2023.11.09.565900v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d24/10680628/ad3013b1229d/nihpp-2023.11.09.565900v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d24/10680628/70198515b430/nihpp-2023.11.09.565900v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9d24/10680628/c914c31cf9ec/nihpp-2023.11.09.565900v1-f0003.jpg
