Bonaldo Maria F, Bair Thomas B, Scheetz Todd E, Snir Einat, Akabogu Ike, Bair Jennifer L, Berger Brian, Crouch Keith, Davis Aja, Eyestone Mari E, Keppel Catherine, Kucaba Tamara A, Lebeck Mark, Lin Jenny L, de Melo Anna I R, Rehmann Joshua, Reiter Rebecca S, Schaefer Kelly, Smith Christina, Tack Dylan, Trout Kurtis, Sheffield Val C, Lin Jim J-C, Casavant Thomas L, Soares Marcelo B
Department of Pediatrics, The University of Iowa, Iowa City, Iowa 52242, USA.
Genome Res. 2004 Oct;14(10B):2053-63. doi: 10.1101/gr.2601304.
As part of the trans-National Institutes of Health (NIH) Mouse Brain Molecular Anatomy Project (BMAP), and in close coordination with the NIH Mammalian Gene Collection Program (MGC), we initiated a large-scale project to clone, identify, and sequence the complete open reading frame (ORF) of transcripts expressed in the developing mouse nervous system. Here we report the analysis of the ORF sequences of 1274 cDNAs obtained from 47 full-length-enriched cDNA libraries, which were constructed using a novel approach described herein. The cDNA libraries were derived from size-fractionated cytoplasmic mRNA isolated from brain and eye tissues obtained at several embryonic stages and postnatal days. Altogether, including the full-ORF MGC sequences derived from these libraries by the MGC sequencing team, NIH_BMAP full-ORF sequences correspond to approximately 20% of all transcripts currently represented in mouse MGC. We show that, as of March 15, 2004, NIH_BMAP clones comprise 68% of mouse MGC cDNAs ≥5 kb and 54% of those ≥4 kb. Importantly, among the 1274 full-ORF sequences, we identified transcripts that are exclusively or predominantly expressed in brain and eye tissues, many of which encode as yet uncharacterized proteins.