Akalin Altuna, Fredman David, Arner Erik, Dong Xianjun, Bryne Jan Christian, Suzuki Harukazu, Daub Carsten O, Hayashizaki Yoshihide, Lenhard Boris
Computational Biology Unit, Bergen Center for Computational Science, and Sars Centre for Marine Molecular Biology, University of Bergen, 5008 Bergen, Norway.
Genome Biol. 2009;10(4):R38. doi: 10.1186/gb-2009-10-4-r38. Epub 2009 Apr 19.
Genomic regulatory blocks (GRBs) are chromosomal regions spanned by highly conserved non-coding elements (HCNEs), most of which serve as regulatory inputs of one target gene in the region. The target genes are most often transcription factors involved in embryonic development and differentiation. GRBs often contain extensive gene deserts, as well as additional 'bystander' genes intertwined with HCNEs but whose expression and function are unrelated to those of the target gene. The tight regulation of target genes, complex arrangement of regulatory inputs, and the differential responsiveness of genes in the region call for the examination of fundamental rules governing transcriptional activity in GRBs. Here we use extensive CAGE tag mapping of transcription start sites across different human tissues and differentiation stages combined with expression data and a number of sequence and epigenetic features to discover these rules and patterns.
We show evidence that GRB target genes have properties that set them apart from their bystanders as well as other genes in the genome: longer CpG islands, a higher number and wider spacing of alternative transcription start sites, and a distinct composition of transcription factor binding sites in their core/proximal promoters. Target gene expression correlates with the acetylation state of HCNEs in the region. Additionally, target gene promoters have a distinct combination of activating and repressing histone modifications in mouse embryonic stem cell lines.
GRB targets are genes with a number of unique features that are the likely cause of their ability to respond to regulatory inputs from very long distances.