Abe Hideaki, Gemmell Neil J
Department of Anatomy, University of Otago, Dunedin, New Zealand.
BMC Genomics. 2014 Oct 15;15(1):900. doi: 10.1186/1471-2164-15-900.
Eukaryotic promoters are regions containing various sequence motifs necessary to control gene transcription. Much evidence has emerged showing that structural and/or contextual changes in regulatory elements can critically affect cis-regulatory activity. As sequence motifs can be key factors in maintaining complex promoter architectures, one effective approach to further understand the evolution of promoter regions in vertebrates is to compare the abundance and distribution patterns of sequence motifs in these regions between divergent species. When compared with mammals, the chicken (Gallus gallus) has a very different genome composition and sufficient genomic information to make it a good model for the exploration of promoter structure and evolution.
More than 10% of chicken genes contained short tandem repeats (STRs) in the region 2 kb upstream of promoters, but the total number of STRs observed in chicken was approximately half of that detected in human promoters. In terms of STR motif frequencies, chicken promoter regions were more similar to other avian and mammalian promoters than to the chicken genome as a whole. Unlike other STRs, nearly half of the trinucleotide repeats found in promoters partly or entirely overlapped with CpG islands, indicating a potential association with nucleosome positions. Moreover, chicken promoters are rich in sequence motifs such as poly-A, poly-G and G-quadruplexes, especially in the core region, that are otherwise rare in the genome. Most sequence motifs showed strong functional enrichment for particular gene ontology (GO) categories, indicating roles in the regulation of transcription and gene expression, as well as in immune response and cognition.
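As an illustration of the kind of motif scan such a survey relies on, the following is a minimal sketch of how perfect STRs and putative G-quadruplexes could be located in a promoter sequence. It is not the authors' pipeline: the abstract does not state which tools or thresholds were used, so the function names (`find_strs`, `find_g4`), the minimum repeat counts in `MIN_REPEATS`, and the widely used G3+N1-7 quadruplex pattern are all assumptions made for this example. Only the G-rich strand is scanned.

```python
import re

# Assumed thresholds (not taken from the paper): minimum number of
# tandem copies required for each motif length, 1 to 6 nt.
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

# Commonly used putative G-quadruplex pattern: four runs of >= 3 G
# separated by loops of 1-7 nt. Assumed here, G-rich strand only.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def find_strs(seq):
    """Return sorted (start, end, motif, copies) tuples for perfect
    tandem repeats, using 0-based half-open coordinates."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One captured unit of length k followed by >= min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (for example "AA" or "CAGCAG"), which belong to a smaller k.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            copies = (m.end() - m.start()) // k
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)


def find_g4(seq):
    """Return (start, end) spans of putative G-quadruplex motifs."""
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq.upper())]


if __name__ == "__main__":
    # Toy sequence: a (CAG)5 trinucleotide repeat and one G-quadruplex motif.
    demo = "TTT" + "CAG" * 5 + "TTT" + "GGGAGGGTGGGAGGG" + "AC"
    print(find_strs(demo))
    print(find_g4(demo))
```

Spans returned this way could then be intersected with CpG island or core-promoter coordinates to estimate overlap; a real analysis would also need to handle imperfect repeats and the reverse strand.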
Chicken promoter regions share some, but not all, of the structural features observed in mammalian promoters. The findings presented here provide empirical evidence suggesting that the frequencies and locations of STR motifs have been conserved through promoter evolution in a lineage-specific manner. Correlation analysis between GO categories and sequence motifs suggests motif-specific constraints acting on gene function.