Glöckner G, Scherer S, Schattevoy R, Boright A, Weber J, Tsui L C, Rosenthal A
Department of Genome Analysis, Institute of Molecular Biotechnology (IMB), 07745 Jena, Germany.
Genome Res. 1998 Oct;8(10):1060-73. doi: 10.1101/gr.8.10.1060.
We have sequenced and annotated two genomic regions located in the Giemsa-negative band q22 of human chromosome 7. The first region, defined by the erythropoietin (EPO) locus, is 228 kb in length and contains 13 genes. Whereas 3 genes (GNB2, EPO, PCOLCE) were previously known at the mRNA level, we identified 10 novel genes using a newly developed automatic annotation tool, RUMMAGE-DP, which comprises >26 different programs, mainly for exon prediction, homology searches, and compositional and repeat analysis. For precise annotation we also resequenced ESTs assigned to the region and assembled them into large cDNAs. In addition, we investigated the differential splicing of genes. Using these tools we annotated 4 of the 10 genes as a zonadhesin, a transferrin homolog, a nucleoporin-like gene, and an actin gene. Two genes showed weak similarity to an insulin-like receptor and to a neuronal protein with a leucine-rich amino-terminal domain. Four predicted genes (CDS1-CDS4) that were confirmed at the mRNA level showed no similarity to known proteins, and no potential function could be assigned to them. The second region in 7q22, defined by the CUTL1 (CCAAT displacement protein and its splice variant) locus, is 416 kb in length and contains three known genes (PMSL12, APS, and CUTL1) and a novel gene (CDS5). The CUTL1 locus, consisting of two splice variants (CDP and CASP), occupies >300 kb. Based on the G+C profile, an isochore switch can be defined between the CUTL1 gene and the APS and PMSL12 genes.
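The isochore switch mentioned in the abstract is inferred from a G+C profile along the sequence. The abstract does not say how that profile was computed, so the following is only a minimal sketch of one common approach: G+C fraction in fixed-size windows, followed by a search for the largest change between adjacent windows. The function names, window size, and step size are illustrative assumptions and are not taken from RUMMAGE-DP.

```python
def gc_profile(seq, window=1000, step=500):
    """Return (start, gc_fraction) for each full window along seq.

    Hypothetical helper: window and step sizes are arbitrary defaults.
    Non-G/C characters (including N) count toward the window length.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


def largest_shift(profile):
    """Return (window_start, delta) for the biggest G+C change between adjacent windows.

    Hypothetical helper: a crude stand-in for locating a compositional
    boundary. Returns (None, 0.0) if the profile is flat or too short.
    """
    best_pos, best_delta = None, 0.0
    for (_, prev), (pos, cur) in zip(profile, profile[1:]):
        delta = abs(cur - prev)
        if delta > best_delta:
            best_pos, best_delta = pos, delta
    return best_pos, best_delta
```

On real genomic sequence, windows of tens of kilobases are more typical for isochore-scale analysis, and a single largest jump is only a rough indicator; the boundary reported in the paper rests on the authors' own analysis.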