Modeling the causal regulatory network by integrating chromatin accessibility and transcriptome data.

Author information

Wang Yong, Jiang Rui, Wong Wing Hung

Affiliations

Department of Statistics, Department of Biomedical Data Science, Bio-X Program, Stanford University, Stanford, CA 94305, USA.

Academy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing 100080, China.

Publication information

Natl Sci Rev. 2016 Jun;3(2):240-251. doi: 10.1093/nsr/nww025. Epub 2016 Apr 19.

DOI:10.1093/nsr/nww025

PMID:28690910

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC5501464/

Abstract

A cell packs a large amount of genetic and regulatory information into a structure known as chromatin, in which DNA is wrapped around histone proteins and tightly packed. To express a gene in a specific coding region, the chromatin opens up, and a DNA loop may form through interactions between enhancers and promoters. Furthermore, the mediator and cohesin complexes, sequence-specific transcription factors, and RNA polymerase II are recruited and work together to finely regulate the expression level. There is a pressing need to understand how the information about when, where, and to what degree genes should be expressed is embedded in chromatin structure and gene regulatory elements. Thanks to large consortia such as the Encyclopedia of DNA Elements (ENCODE) and Roadmap Epigenomics projects, extensive data on chromatin accessibility and transcript abundance are available across many tissues and cell types. These rich data offer an exciting opportunity to model the causal regulatory relationship. Here, we review the current experimental approaches, foundational data, computational problems, interpretive frameworks, and integrative models that will enable accurate interpretation of the regulatory landscape. In particular, we discuss efforts to organize, analyze, model, and integrate DNA accessibility data, transcriptional data, and functional genomic regions. We believe that these efforts will eventually help us understand the information flow within the cell and will influence research directions across many fields.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7447/5501464/216c6f456a2a/nihms803976f1.jpg

Similar articles

Accurate Promoter and Enhancer Identification in 127 ENCODE and Roadmap Epigenomics Cell Types and Tissues by GenoSTAN.

PLoS One. 2017 Jan 5;12(1):e0169249. doi: 10.1371/journal.pone.0169249. eCollection 2017.

Modeling gene regulation from paired expression and chromatin accessibility data.

Proc Natl Acad Sci U S A. 2017 Jun 20;114(25):E4914-E4923. doi: 10.1073/pnas.1704553114. Epub 2017 Jun 2.

Modelling the conditional regulatory activity of methylated and bivalent promoters.

Epigenetics Chromatin. 2015 Jun 19;8:21. doi: 10.1186/s13072-015-0013-9. eCollection 2015.

A Short Report on the Markov Property of DNA Sequences on 200-bp Genomic Units of ENCODE/Broad ChromHMM Annotations: A Computational Perspective.

Genomics Inform. 2018 Sep;16(3):65-70. doi: 10.5808/GI.2018.16.3.65. Epub 2018 Sep 30.

cisDynet: An integrated platform for modeling gene-regulatory dynamics and networks.

Imeta. 2023 Nov 23;2(4):e152. doi: 10.1002/imt2.152. eCollection 2023 Nov.

Changing the DNA landscape: putting a SPN on chromatin.

Curr Top Microbiol Immunol. 2003;274:171-201. doi: 10.1007/978-3-642-55747-7_7.

Cell Type-Specific Chromatin Signatures Underline Regulatory DNA Elements in Human Induced Pluripotent Stem Cells and Somatic Cells.

Circ Res. 2017 Nov 10;121(11):1237-1250. doi: 10.1161/CIRCRESAHA.117.311367. Epub 2017 Oct 13.

A Short Report on the Markov Property of DNA Sequences on 200-bp Genomic Units of Roadmap Genomics ChromHMM Annotations: A Computational Perspective.

Genomics Inform. 2018 Dec;16(4):e27. doi: 10.5808/GI.2018.16.4.e27. Epub 2018 Dec 28.

Chromatin accessibility dynamics reveal novel functional enhancers in .

Genome Res. 2017 Dec;27(12):2096-2107. doi: 10.1101/gr.226233.117. Epub 2017 Nov 15.

Cited by

SemanticCAP: Chromatin Accessibility Prediction Enhanced by Features Learning from a Language Model.

Genes (Basel). 2022 Mar 23;13(4):568. doi: 10.3390/genes13040568.

DeepCAGE: Incorporating Transcription Factors in Genome-wide Prediction of Chromatin Accessibility.

Genomics Proteomics Bioinformatics. 2022 Jun;20(3):496-507. doi: 10.1016/j.gpb.2021.08.015. Epub 2022 Mar 12.

Epigenetics of wheat-rust interaction: an update.

Planta. 2022 Jan 27;255(2):50. doi: 10.1007/s00425-022-03829-y.

Targeting Chromatin Complexes in Myeloid Malignancies and Beyond: From Basic Mechanisms to Clinical Innovation.

Cells. 2020 Dec 21;9(12):2721. doi: 10.3390/cells9122721.

Global Role of Crop Genomics in the Face of Climate Change.

Front Plant Sci. 2020 Jul 16;11:922. doi: 10.3389/fpls.2020.00922. eCollection 2020.

From reads to insight: a hitchhiker's guide to ATAC-seq data analysis.

Genome Biol. 2020 Feb 3;21(1):22. doi: 10.1186/s13059-020-1929-3.

Hierarchical graphical model reveals HFR1 bridging circadian rhythm and flower development in .

NPJ Syst Biol Appl. 2019 Aug 12;5:28. doi: 10.1038/s41540-019-0106-3. eCollection 2019.

Constructing tissue-specific transcriptional regulatory networks via a Markov random field.

BMC Genomics. 2018 Dec 31;19(Suppl 10):884. doi: 10.1186/s12864-018-5277-6.

Chromatin accessibility prediction via convolutional long short-term memory networks with k-mer embedding.

Bioinformatics. 2017 Jul 15;33(14):i92-i101. doi: 10.1093/bioinformatics/btx234.

Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization.

BMC Cancer. 2017 Aug 2;17(1):513. doi: 10.1186/s12885-017-3500-5.
