Bina Minou, Wyss Phillip, Ren Wenhui, Szpankowski Wojciech, Thomas Elizabeth, Randhawa Ranjit, Reddy Sreedeepti, John Priya M, Pares-Matos Elsie I, Stein Arnold, Xu Hao, Lazarus Sheryl A
Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA.
Genomics. 2004 Dec;84(6):929-40. doi: 10.1016/j.ygeno.2004.08.013.
Central to reconstruction of cis-regulatory networks is identification and classification of naturally occurring transcription factor-binding sites according to the genes that they control. We have examined salient characteristics of 9-mers that occur in various orders and combinations in the proximal promoters of human genes. In evaluations of a dataset derived with respect to experimentally defined transcription initiation sites, in some cases we observed a clear correspondence of highly ranked 9-mers with protein-binding sites in genomic DNA. Evaluations of the larger dataset, derived with respect to the 5' end of human ESTs, revealed that a subset of the highly ranked 9-mers corresponded to sites for several known transcription factor families (including CREB, ETS, EGR-1, SP1, KLF, MAZ, HIF-1, and STATs) that play important roles in the regulation of vertebrate genes. We identified several highly ranked CpG-containing 9-mers, defining sites for interactions with the CREB and ETS families of proteins, and identified potential target genes for these proteins. The results of the studies imply that the CpG-containing transcription factor-binding sites regulate the expression of genes with important roles in pathways leading to cell-type-specific gene expression and pathways controlled by the complex networks of signaling systems.
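The abstract does not say how the 9-mers were scored or ranked, so the following is only a minimal sketch of the general idea: enumerate overlapping 9-mers in a set of promoter sequences, rank them, and pull out the CpG-containing ones. The ranking criterion (number of promoters containing the 9-mer), the function names `kmers`, `rank_kmers`, and `contains_cpg`, and the toy sequences are all assumptions made for illustration; they are not the authors' method or data.

```python
from collections import Counter

K = 9  # word length; the abstract analyzes 9-mers


def kmers(seq, k=K):
    """Yield every overlapping k-mer of seq made only of A, C, G, T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= set("ACGT"):  # skip words with N or other symbols
            yield word


def rank_kmers(promoters, k=K):
    """Rank k-mers by how many promoters contain them.

    This scoring rule is an assumption; the abstract does not state
    the ranking criterion that was actually used.
    """
    counts = Counter()
    for seq in promoters:
        counts.update(set(kmers(seq, k)))  # count each k-mer once per promoter
    return counts.most_common()


def contains_cpg(word):
    """True if the word contains a CpG dinucleotide (C followed by G)."""
    return "CG" in word


# Hypothetical toy sequences, not real promoter data.
toy_promoters = [
    "GGTGACGTCAGG",
    "TTTGACGTCAGT",
    "AAAAAAAAAAAA",
]

ranked = rank_kmers(toy_promoters)
cpg_ranked = [(word, n) for word, n in ranked if contains_cpg(word)]
```

With real data, `toy_promoters` would be replaced by proximal promoter sequences extracted relative to transcription initiation sites or EST 5' ends, and the top of `cpg_ranked` could then be compared against known binding-site motifs such as those of the CREB and ETS families.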