Zhang Q H, Ye M, Wu X Y, Ren S X, Zhao M, Zhao C J, Fu G, Shen Y, Fan H Y, Lu G, Zhong M, Xu X R, Han Z G, Zhang J W, Tao J, Huang Q H, Zhou J, Hu G X, Gu J, Chen S J, Chen Z
Shanghai Institute of Hematology (SIH), Rui Jin Hospital affiliated with Shanghai Second Medical University, Shanghai 200025, China.
Genome Res. 2000 Oct;10(10):1546-60. doi: 10.1101/gr.140200.
Three hundred cDNAs containing putatively complete open reading frames (ORFs) for previously undefined genes were obtained from CD34+ hematopoietic stem/progenitor cells (HSPCs), based on EST cataloging, clone sequencing, in silico cloning, and rapid amplification of cDNA ends (RACE). The cDNAs ranged in size from 360 to 3496 bp, and their ORFs encoded peptides of 58-752 amino acids. Public database searches indicated that 225 cDNAs exhibited sequence similarity to genes identified across a variety of species. Homology analysis identified 50 basic structural motifs/domains among these cDNAs. Genomic exon-intron organization could be established for 243 genes by integrating the cDNA data with genome sequence information. Interestingly, a new gene on 3p, named HSPC070, was found to share a 105-bp sequence in its 3' UTR with the RAF gene, in the opposite transcriptional orientation. Chromosomal localizations were obtained by electronic mapping for 192 genes and by radiation hybrid (RH) mapping for 38 genes. A macroarray technique was applied to screen gene expression patterns in five hematopoietic cell lines (NB4, HL60, U937, K562, and Jurkat), and a number of differentially expressed genes were found. This resource work provides a wide range of information useful not only for expression genomics and annotation of genomic DNA sequence, but also for further research on the function of genes involved in hematopoietic development and differentiation.