John R M, Robbins C A, Myers R M
Department of Genetics, School of Medicine, Stanford University, CA 94305-5120.
Hum Mol Genet. 1994 Sep;3(9):1611-6. doi: 10.1093/hmg/3.9.1611.
We combined the isolation of gene-enriched genomic DNA with computer-based gene prediction to search for genes in a cosmid contig covering one million base pairs in the Huntington disease region on chromosome 4. Our aim was to develop a simple, robust strategy for identifying genes adjacent to CpG islands without first characterizing undermethylated regions that contain multiple rare-cutter restriction enzyme sites. We cloned DNA adjacent to sites for the rare-cutter restriction enzymes EagI and SacII, which are predicted to cut more frequently within CpG islands, and relied solely on minimal sequence analysis to assess the likely coding potential of the DNA next to these sites. Our results indicated that isolating fragments with a single rare-cutter restriction enzyme site was sufficient to provide a high likelihood of identifying genes. Of the 42 CpG-selected clones analyzed, 17 contained exons, as judged by sequence identity to known genes in this region, sequence identity to gene fragments isolated by direct cDNA selection in our laboratory, and/or their ability to detect transcripts on Northern blots. Analysis of the sequences with the BLAST and GRAIL programs provided additional independent evidence that 15 of these 17 clones contain coding sequences and that nine other clones are likely to contain sequences coding for portions of new genes. By mapping these clones to an EcoRI restriction map of the region, we determined a detailed localization for each of the exons and estimated that there are a minimum of seven genes that contain CpG-rich DNA between D4S126 and D4S181.
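As an illustration of the site-selection step, the following is a minimal sketch of how EagI and SacII recognition sites could be located in a DNA sequence. It is not the authors' procedure, which was carried out by cloning rather than by computation. The recognition sequences (CGGCCG for EagI, CCGCGG for SacII) are the standard ones for these enzymes; the names `RARE_CUTTERS` and `find_sites` are hypothetical and introduced only for this example.

```python
import re

# Standard recognition sequences for the two rare-cutter enzymes
# named in the abstract. Both are palindromic, so scanning one
# strand is enough to find every site.
RARE_CUTTERS = {"EagI": "CGGCCG", "SacII": "CCGCGG"}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for each recognition site.

    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    sites = {}
    for name, motif in RARE_CUTTERS.items():
        # A lookahead pattern reports overlapping occurrences as well.
        pattern = "(?=" + motif + ")"
        sites[name] = [m.start() for m in re.finditer(pattern, seq)]
    return sites


if __name__ == "__main__":
    # Made-up toy sequence with one site for each enzyme.
    example = "ATATCGGCCGTTTTCCGCGGAT"
    print(find_sites(example))
```

Both recognition sequences contain CpG dinucleotides, which is consistent with the abstract's premise that these enzymes are predicted to cut more frequently within CpG islands.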