Sharov Alexei A, Dudekula Dawood B, Ko Minoru S H
Developmental Genomics and Aging Section, Laboratory of Genetics, National Institute on Aging, National Institutes of Health, 333 Cassell Drive, Suite 3000, Baltimore, MD 21224, USA.
DNA Res. 2006 Jun 30;13(3):123-34. doi: 10.1093/dnares/dsl005. Epub 2006 Sep 15.
To facilitate the analysis of gene regulatory regions of the mouse genome, we developed CisView (http://lgsun.grc.nia.nih.gov/cisview), a browser and database of genome-wide potential transcription factor binding sites (TFBSs). The TFBSs were identified using 134 position-weight matrices and 219 sequence patterns from various sources and are presented together with information about sequence conservation, neighboring genes and their structures, GO annotations, protein domains, DNA repeats and CpG islands. Analysis of the distribution of TFBSs revealed that many TFBSs (N = 145) were over-represented near transcription start sites. We also identified potential cis-regulatory modules (CRMs), defined as clusters of conserved TFBSs, in the entire mouse genome. Out of 739 074 CRMs, 157 442 had a significantly higher regulatory potential score than semi-random sequences generated with a 3rd-order Markov process. The CisView browser provides a user-friendly computer environment for studying transcription regulation on a whole-genome scale and can also be used for interpreting microarray experiments and identifying putative targets of transcription factors.
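The semi-random control sequences mentioned above can be illustrated with a minimal sketch of a 3rd-order Markov generator, in which each base is drawn conditionally on the three preceding bases. The abstract does not say how the model was trained or sampled, so the training input, the dead-end handling, and the names `train_markov` and `generate` are all assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter, defaultdict

ORDER = 3  # third order: each base depends on the previous three bases


def train_markov(sequences, order=ORDER):
    """Count which base follows each `order`-mer context (hypothetical helper)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - order):
            context = seq[i:i + order]
            nxt = seq[i + order]
            # Skip windows containing ambiguous symbols such as N.
            if set(context + nxt) <= set("ACGT"):
                counts[context][nxt] += 1
    return counts


def generate(counts, length, rng, order=ORDER):
    """Sample a sequence of `length` bases from the trained counts.

    Assumes `counts` is non-empty. If a context was never seen in
    training, the chain restarts from a randomly chosen observed context
    (an assumption of this sketch).
    """
    contexts = sorted(counts)
    seq = list(rng.choice(contexts))
    while len(seq) < length:
        followers = counts.get("".join(seq[-order:]))
        if not followers:
            seq.extend(rng.choice(contexts))
            continue
        bases = sorted(followers)
        weights = [followers[b] for b in bases]
        seq.append(rng.choices(bases, weights=weights)[0])
    return "".join(seq[:length])


# Usage: train on a toy background sequence, then draw one control sequence.
background = ["ACGTTGCAACGTGGCATTACGCGTATAGCTAGCTAGGCTA"]
model = train_markov(background)
control = generate(model, 30, random.Random(42))
print(control)
```

A 3rd-order model preserves the tetranucleotide composition of the training sequence, so controls generated this way resemble genomic background more closely than uniformly random DNA would. Scoring such controls with the same regulatory potential measure as the real CRMs gives a null distribution to compare against.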