Donaldson Ian J, Göttgens Berthold
Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 2XY, UK.
Nucleic Acids Res. 2007;35(1):e1. doi: 10.1093/nar/gkl839. Epub 2006 Nov 27.
Specificity of mammalian gene regulatory regions is achieved to a large extent through the combinatorial binding of sets of transcription factors to distinct binding sites, discrete combinations of which are often referred to as regulatory modules. Identification and subsequent characterization of gene regulatory modules will be a key step in assembling transcriptional regulatory networks from gene expression profiling data, with the ultimate goal of unravelling the regulatory codes that govern gene expression in various cell types. Here we describe the new bioinformatics tool, Composite Motif Discovery (CoMoDis), which streamlines computational identification of novel regulatory modules starting from a single seed motif. Seed motifs represent binding sites conserved across mammalian species. CoMoDis facilitates novel motif discovery by automating the extraction of DNA sequences flanking seed motifs and streamlining downstream motif discovery using a variety of tools, including several that utilize phylogenetic conservation criteria. CoMoDis is available at http://hscl.cimr.cam.ac.uk/CoMoDis_portal.html.
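The flank-extraction step described in the abstract can be illustrated with a minimal Python sketch. This is not the CoMoDis implementation: the function names, the IUPAC-to-regex matching, the default flank width, and the forward-strand-only search are all assumptions made for illustration, and the sketch does not apply the cross-species conservation filter that the abstract attributes to seed motifs.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def seed_to_regex(seed):
    """Compile a seed motif (IUPAC string) into a regex.

    The lookahead wrapper lets overlapping occurrences be reported.
    """
    body = "".join(IUPAC[base] for base in seed.upper())
    return re.compile("(?=(" + body + "))")


def extract_flanks(sequence, seed, flank=20):
    """Return the sequence flanking each forward-strand seed occurrence.

    Hypothetical helper: `flank` is the number of bases kept on each side;
    flanks are truncated at the ends of the sequence.
    """
    sequence = sequence.upper()
    pattern = seed_to_regex(seed)
    hits = []
    for match in pattern.finditer(sequence):
        start = match.start(1)
        end = start + len(seed)
        hits.append({
            "start": start,
            "match": match.group(1),
            "left": sequence[max(0, start - flank):start],
            "right": sequence[end:end + flank],
        })
    return hits
```

Calling `extract_flanks(sequence, "GATA", flank=20)` on a DNA string returns one record per occurrence; in a workflow of the kind the abstract describes, the collected `left` and `right` flanks would then be passed to a motif discovery tool. The seed `GATA` here is only an example input.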