Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 W 7th Ave Suite 100, Vancouver, BC, Canada.
Nucleic Acids Res. 2010 May;38(9):2990-3004. doi: 10.1093/nar/gkq003. Epub 2010 Jan 25.
The recent publication of the Caenorhabditis elegans cisRED database has provided an extensive catalog of upstream elements that are conserved between nematode genomes. We have performed a secondary analysis to determine which subsequences of the cisRED motifs are found in multiple locations throughout the C. elegans genome. We used the word-counting motif discovery algorithm DME to form the motifs into groups based on sequence similarity. We then examined the genes associated with each motif group using DAVID and Ontologizer to determine which groups are associated with genes that also have significant functional associations in the Gene Ontology and other gene annotation sources. Of the 3265 motif groups formed, 612 (19%) had significant functional associations with respect to GO terms. Eight of the first 20 motif groups based on frequent dodecamers among the cisRED motif sequences were specifically associated with ribosomal protein genes; two of these were similar to mouse EBP-45, rat HNF3-family and Drosophila Zeste transcription factor binding sites. Additionally, seven motif groups were extensions of the canonical C. elegans trans-splice acceptor site. One motif group was tested for regulatory function in a series of green fluorescent protein expression experiments and was shown to be involved in pharyngeal expression.
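The idea of ranking frequent dodecamers among motif sequences can be sketched in a few lines of Python. This is only an illustration of word counting, not the DME algorithm itself, which the abstract does not describe in detail. The function names, the toy sequences, and the `min_count` threshold are all hypothetical, and the sketch ignores reverse complements and any similarity-based merging of words into groups.

```python
from collections import Counter


def count_kmers(sequences, k=12):
    """Count how many input sequences contain each k-mer (k=12: dodecamers)."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Use a set so a k-mer repeated within one sequence is counted once.
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(kmers)
    return counts


def frequent_kmers(sequences, k=12, min_count=2):
    """Return (k-mer, count) pairs seen in at least min_count sequences,
    sorted by descending count and then alphabetically."""
    counts = count_kmers(sequences, k)
    return sorted(
        ((kmer, n) for kmer, n in counts.items() if n >= min_count),
        key=lambda item: (-item[1], item[0]),
    )


# Toy input (hypothetical sequences, not taken from cisRED).
motifs = ["ACGTACGTACGTAA", "TTACGTACGTACGT", "GGGGGGGGGGGGGG"]
print(frequent_kmers(motifs))
```

In a real analysis, the most frequent words would then seed motif groups, and the genes linked to each group would be passed to an enrichment tool such as DAVID or Ontologizer.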