Predicting cell-type-specific gene expression from regions of open chromatin.

Affiliation

Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27708, USA.

Publication information

Genome Res. 2012 Sep;22(9):1711-22. doi: 10.1101/gr.135129.111.

DOI:10.1101/gr.135129.111

PMID:22955983

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3431488/

Abstract

Complex patterns of cell-type-specific gene expression are thought to be achieved by combinatorial binding of transcription factors (TFs) to sequence elements in regulatory regions. Predicting cell-type-specific expression in mammals has been hindered by the oftentimes unknown location of distal regulatory regions. To alleviate this bottleneck, we used DNase-seq data from 19 diverse human cell types to identify proximal and distal regulatory elements at genome-wide scale. Matched expression data allowed us to separate genes into classes of cell-type-specific up-regulated, down-regulated, and constitutively expressed genes. CG dinucleotide content and DNA accessibility in the promoters of these three classes of genes displayed substantial differences, highlighting the importance of including these aspects in modeling gene expression. We associated DNase I hypersensitive sites (DHSs) with genes, and trained classifiers for different expression patterns. TF sequence motif matches in DHSs provided a strong performance improvement in predicting gene expression over the typical baseline approach of using proximal promoter sequences. In particular, we achieved competitive performance when discriminating up-regulated genes from different cell types or genes up- and down-regulated under the same conditions. We identified previously known and new candidate cell-type-specific regulators. The models generated testable predictions of activating or repressive functions of regulators. DNase I footprints for these regulators were indicative of their direct binding to DNA. In summary, we successfully used information of open chromatin obtained by a single assay, DNase-seq, to address the problem of predicting cell-type-specific gene expression in mammalian organisms directly from regulatory sequence.
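As a rough illustration of the idea of associating DHSs with genes and summarizing TF motif matches per gene, here is a minimal Python sketch. The abstract does not specify how DHSs were assigned to genes, how motifs were scored, or which classifier was used, so everything below is an assumption for illustration only: the nearest-TSS rule, the `max_dist` cutoff, the exact-string motif matching (instead of position weight matrices), and all gene names, coordinates, sequences, and motif labels are hypothetical toy values on a single chromosome.

```python
from bisect import bisect_left
from collections import Counter


def nearest_gene(dhs_mid, tss_sorted, max_dist):
    """Return the gene whose TSS is closest to dhs_mid, or None if it is
    farther than max_dist (assumed association rule, not the paper's)."""
    positions = [pos for pos, _ in tss_sorted]
    i = bisect_left(positions, dhs_mid)
    # Only the TSS just left and just right of the DHS can be the nearest.
    candidates = tss_sorted[max(0, i - 1): i + 1]
    if not candidates:
        return None
    pos, gene = min(candidates, key=lambda t: abs(t[0] - dhs_mid))
    return gene if abs(pos - dhs_mid) <= max_dist else None


def count_motif(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def gene_motif_features(dhss, tss, motifs, max_dist):
    """Build per-gene motif-count features from DHS sequences.

    dhss:   list of (midpoint, sequence) tuples
    tss:    list of (position, gene_name) tuples
    motifs: dict mapping a motif label to an exact-match string
    """
    tss_sorted = sorted(tss)
    features = {}
    for mid, seq in dhss:
        gene = nearest_gene(mid, tss_sorted, max_dist)
        if gene is None:
            continue  # DHS too far from any TSS under the assumed cutoff
        counts = features.setdefault(gene, Counter())
        for name, motif in motifs.items():
            counts[name] += count_motif(seq, motif)
    return features


# Hypothetical toy data (not from the paper).
tss = [(1000, "GENE_A"), (90000, "GENE_B")]
dhss = [
    (1200, "TTGGAAGGAAGT"),
    (3000, "ACACGTGA"),
    (60000, "ACGTACGT"),      # farther than max_dist from both TSSs
    (89500, "CCGGAAGTGGAA"),
]
motifs = {"ETS_like": "GGAA", "E_box_like": "CACGTG"}

features = gene_motif_features(dhss, tss, motifs, max_dist=20_000)
for gene, counts in sorted(features.items()):
    print(gene, dict(counts))
```

The resulting per-gene count vectors could then serve as input to any standard classifier that separates, for example, up-regulated from down-regulated genes; which features and model work best is exactly the kind of question the study addresses.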

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/48b3/3431488/c375a61fd126/1711fig1.jpg
