Lizio Marina, Ishizu Yuri, Itoh Masayoshi, Lassmann Timo, Hasegawa Akira, Kubosaki Atsutaka, Severin Jessica, Kawaji Hideya, Nakamura Yukio, Suzuki Harukazu, Hayashizaki Yoshihide, Carninci Piero, Forrest Alistair R R
RIKEN Center for Life Science Technologies Yokohama, Japan; Division of Genomic Technologies, RIKEN Center for Life Science Technologies Yokohama, Japan.
RIKEN Center for Life Science Technologies Yokohama, Japan; Division of Genomic Technologies, RIKEN Center for Life Science Technologies Yokohama, Japan; RIKEN Preventive Medicine and Diagnosis Innovation Program Yokohama, Japan.
Front Genet. 2015 Nov 18;6:331. doi: 10.3389/fgene.2015.00331. eCollection 2015.
Mammals are composed of hundreds of different cell types with specialized functions. Each of these cellular phenotypes is controlled by a different combination of transcription factors. Using a human non-islet-cell insulinoma cell line (TC-YIK), which expresses insulin and the majority of known pancreatic beta-cell-specific genes, as an example, we describe a general approach to identify key cell-type-specific transcription factors (TFs) and their direct and indirect targets. By ranking all human TFs by their level of enriched expression in TC-YIK relative to a broad collection of samples (FANTOM5), we confirmed known key regulators of pancreatic function and development. Systematic siRNA-mediated perturbation of these TFs followed by qRT-PCR revealed their interconnections, with NEUROD1 at the top of the regulation hierarchy and its depletion drastically reducing insulin levels. For 15 of the TF knock-downs (KD), we then used Cap Analysis of Gene Expression (CAGE) to identify thousands of their targets genome-wide (KD-CAGE). The data confirm NEUROD1 as a key positive regulator in the transcriptional regulatory network (TRN), and ISL1 and PROX1 as antagonists. As a complementary approach, we used ChIP-seq on four of these factors to identify NEUROD1, LMX1A, PAX6, and RFX6 binding sites in the human genome. Examining the overlap between genes perturbed in the KD-CAGE experiments and genes with a ChIP-seq peak within 50 kb of their promoter, we identified direct transcriptional targets of these TFs. Integration of KD-CAGE and ChIP-seq data shows that both NEUROD1 and LMX1A work as the main transcriptional activators. In the core TRN (i.e., TF-TF only), NEUROD1 directly transcriptionally activates the pancreatic TFs HSF4, INSM1, MLXIPL, MYT1, NKX6-3, ONECUT2, PAX4, PROX1, RFX6, ST18, DACH1, and SHOX2, while LMX1A directly transcriptionally activates DACH1, SHOX2, PAX6, and PDX1.
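The enrichment ranking described above can be illustrated with a minimal sketch. The abstract does not state the exact enrichment statistic, so the score used here (log2 ratio of expression in the target sample to the median across background samples, with a pseudocount) is an assumption, as are the function name `rank_by_enrichment`, the placeholder gene names, and the toy expression values, which are made up purely for illustration.

```python
import math
from statistics import median

def rank_by_enrichment(target, background, pseudocount=1.0):
    """Rank genes by an ASSUMED enrichment score:
    log2((target + pseudocount) / (median(background) + pseudocount)).

    target:     dict gene -> expression in the sample of interest
    background: dict gene -> list of expression values in other samples
    Returns a list of (gene, score) tuples, most enriched first.
    """
    scores = {}
    for gene, expr in target.items():
        bg = median(background[gene])
        scores[gene] = math.log2((expr + pseudocount) / (bg + pseudocount))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy, made-up values with placeholder gene names (not data from the study).
target = {"TF_A": 127.0, "TF_B": 15.0, "TF_C": 7.0}
background = {
    "TF_A": [0.0, 1.0, 3.0],    # low elsewhere -> strongly enriched
    "TF_B": [15.0, 31.0, 7.0],  # similar elsewhere -> not enriched
    "TF_C": [0.0, 0.0, 1.0],    # absent elsewhere -> moderately enriched
}

ranked = rank_by_enrichment(target, background)
for gene, score in ranked:
    print(f"{gene}\t{score:.2f}")
```

Using the median of the background rather than the mean keeps a single highly expressing sample from masking enrichment. A real analysis would likely also need normalization across libraries, which this sketch omits.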
Analysis of these complementary datasets suggests the need for caution in interpreting ChIP-seq datasets. (1) A large fraction of binding sites are at distal enhancer sites and cannot be directly associated with their targets without chromatin conformation data. (2) Many peaks may be non-functional: even when there is a peak at a promoter, the expression of the gene may not be affected in the matching perturbation experiment.
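The integration of knock-down and binding data, and the two caveats above, can be sketched as a simple classification. This is an illustrative sketch, not the authors' pipeline: it assumes each promoter and each ChIP-seq peak is reduced to a single coordinate, and that "within 50 kb" means an absolute distance of at most 50,000 bp on the same chromosome. The function names, category labels, gene names, and coordinates are all hypothetical.

```python
WINDOW = 50_000  # 50 kb window around the promoter, as stated in the text

def has_peak_near(promoter, peaks, window=WINDOW):
    """True if any peak lies on the same chromosome within `window` bp."""
    chrom, pos = promoter
    return any(c == chrom and abs(p - pos) <= window for c, p in peaks)

def classify(promoters, perturbed, peaks, window=WINDOW):
    """Combine knock-down response with nearby binding for each gene.

    promoters: dict gene -> (chromosome, promoter position)
    perturbed: set of genes whose expression changed after the knock-down
    peaks:     list of (chromosome, peak position) for the same TF
    """
    result = {}
    for gene, promoter in promoters.items():
        bound = has_peak_near(promoter, peaks, window)
        changed = gene in perturbed
        if bound and changed:
            result[gene] = "direct_candidate"
        elif changed:
            # Caveat (1): could be indirect, or bound at a distal enhancer
            # that cannot be linked without chromatin conformation data.
            result[gene] = "indirect_or_distal"
        elif bound:
            # Caveat (2): a nearby peak without any expression change.
            result[gene] = "bound_unaffected"
        else:
            result[gene] = "neither"
    return result

# Toy, made-up coordinates with placeholder gene names.
promoters = {
    "gene_1": ("chr1", 100_000),
    "gene_2": ("chr1", 500_000),
    "gene_3": ("chr2", 100_000),
    "gene_4": ("chr2", 900_000),
}
peaks = [("chr1", 120_000), ("chr2", 95_000)]
perturbed = {"gene_1", "gene_2"}

labels = classify(promoters, perturbed, peaks)
for gene, label in labels.items():
    print(gene, label)
```

Only the genes that are both perturbed and bound count as candidate direct targets. The other two non-empty categories correspond to the two reasons for caution listed above.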