Bovine Functional Genomics Laboratory, United States Department of Agriculture, Agricultural Research Service (USDA-ARS), Beltsville Agricultural Research Center, Beltsville, MD 20705, USA.
Genomics Proteomics Bioinformatics. 2013 Jun;11(3):195-8. doi: 10.1016/j.gpb.2012.10.004. Epub 2013 Feb 1.
No resource providing candidate transcription factor binding sites (TFBSs) currently exists for cattle. Such data are needed because predicted sites can serve as starting points for future omics studies that develop hypotheses about transcriptional regulation. To generate this resource, we combined phylogenetic footprinting, using sequence conservation across cattle, human and dog, with position-specific scoring matrices to identify 379,333 putative TFBSs upstream of nearly 8000 Mammalian Gene Collection (MGC) annotated genes in the cattle genome. Comparing our predictions with known binding site loci in the PCK1, ACTA1 and G6PC promoter regions showed a sensitivity of 75% for our method. We also intersected our predictions with known cattle SNP variants in dbSNP and on the Illumina BovineHD 770k and Bos 1 SNP chips, finding 7534, 444 and 346 overlaps, respectively. Because of our stringent filtering criteria, these results represent high-quality predictions of putative TFBSs within the cattle genome. All binding site predictions are freely available at http://bfgl.anri.barc.usda.gov/BovineTFBS/ or http://199.133.54.77/BovineTFBS.
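The two ideas named in the abstract, scoring candidate sites with a position-specific scoring matrix and keeping only sites conserved across species, can be illustrated with a minimal Python sketch. The abstract does not give the matrices, score thresholds, or alignment procedure that were used, so everything below is an assumption made for illustration: the toy motif sites, the pseudocount, the uniform background, the threshold, the ungapped pre-aligned sequences, and the function names `build_pssm`, `scan`, and `conserved_starts`. It is not the authors' pipeline.

```python
import math

BASES = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix from aligned, equal-length example sites.

    Assumes a uniform background and a simple additive pseudocount.
    """
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pssm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, score) for each window scoring at or above threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or gap characters
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def conserved_starts(aligned_seqs, pssm, threshold):
    """Keep start positions that score as hits in every species.

    Assumes the sequences are already aligned without gaps, so the same
    index refers to the same alignment column in each species.
    """
    per_species = [
        {start for start, _ in scan(seq, pssm, threshold)}
        for seq in aligned_seqs.values()
    ]
    return sorted(set.intersection(*per_species))


if __name__ == "__main__":
    # Hypothetical example sites and promoter fragments, not real data.
    example_sites = ["TATAAA", "TATAAT", "TATATA", "TTTAAA"]
    matrix = build_pssm(example_sites)
    aligned = {
        "cattle": "GGCGTATAAAGGC",
        "human": "GACGTATAAAGCC",
        "dog": "GGCGTATCAAGGC",
    }
    print(conserved_starts(aligned, matrix, threshold=5.0))
```

Requiring a hit at the same alignment column in every species is a deliberately strict filter: a single substitution in one species, as in the hypothetical dog sequence above, removes the candidate. That trades sensitivity for fewer false positives, in the spirit of the stringent filtering the abstract describes.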