Liu George E, Weirauch Matthew T, Van Tassell Curtis P, Li Robert W, Sonstegard Tad S, Matukumalli Lakshmi K, Connor Erin E, Hanson Richard W, Yang Jianqi
Bovine Functional Genomics Laboratory, Beltsville Agricultural Research Center, Beltsville, MD 20705, USA.
Genomics Proteomics Bioinformatics. 2008 Dec;6(3-4):129-43. doi: 10.1016/S1672-0229(09)60001-2.
We used a systematic phylogenetic footprinting approach to identify conserved transcription factor binding sites (TFBSs) in mammalian promoter regions from human, mouse and rat sequence alignments. The score distributions of most binding site models did not follow the Gaussian distribution that many statistical methods assume, so we ran an empirical test to establish the optimal threshold for each model. We assessed our computational predictions against previously known TFBSs in the promoter of PCK1, the gene encoding the cytosolic isoform of phosphoenolpyruvate carboxykinase, and achieved a sensitivity of 75% and a specificity of approximately 32%. Almost all known sites overlapped with predicted sites, and several new putative TFBSs were also identified. We validated the role of a predicted SP1 binding site in controlling PCK1 transcription using gel shift and reporter assays. Finally, we applied our computational approach to predict putative TFBSs within the promoter regions of all available RefSeq genes. Our full set of TFBS predictions is freely available at http://bfgl.anri.barc.usda.gov/tfbsConsSites.
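The abstract does not say how the empirical per-model threshold was derived, so the following is only a minimal sketch of one common way to do it: score a set of background sequences with a position weight matrix and take a high percentile of those scores as the cutoff, with no assumption about the shape of the distribution. The count matrix, the background sequences, the 1% false-positive rate, and all function names are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math
import random

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores (uniform background assumed)."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix

def best_window_score(matrix, sequence):
    """Highest score of the matrix over all windows of one strand of the sequence."""
    width = len(matrix)
    return max(
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )

def empirical_threshold(matrix, background_seqs, false_positive_rate=0.01):
    """Cutoff that at most `false_positive_rate` of background sequences exceed."""
    scores = sorted(best_window_score(matrix, s) for s in background_seqs)
    index = min(len(scores) - 1, math.ceil((1 - false_positive_rate) * len(scores)))
    return scores[index]

# Hypothetical GC-rich toy motif (NOT a real SP1 matrix): 18 of 20 counts on the consensus base.
consensus = "GGGGCGGGGC"
toy_counts = [{b: (18 if b == c else 0) for b in "ACGT"} for c in consensus]
toy_matrix = log_odds_matrix(toy_counts)

# Hypothetical background: uniform random sequences; real use would draw on genomic sequence.
rng = random.Random(0)
background_seqs = ["".join(rng.choice("ACGT") for _ in range(50)) for _ in range(1000)]

threshold = empirical_threshold(toy_matrix, background_seqs)
exceed_fraction = sum(
    best_window_score(toy_matrix, s) > threshold for s in background_seqs
) / len(background_seqs)
```

A site would then be called only when its score is strictly greater than `threshold`. Because the cutoff is read directly off the sorted background scores, it stays valid for skewed or multimodal score distributions, which is the situation the abstract describes.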