Boardman Paul E, Sanz-Ezquerro Juan, Overton Ian M, Burt David W, Bosch Elizabeth, Fong Willy T, Tickle Cheryll, Brown William R A, Wilson Stuart A, Hubbard Simon J
Department of Biomolecular Sciences, University of Manchester Institute of Science and Technology, P.O. Box 88, M60 1QD, Manchester, United Kingdom.
Curr Biol. 2002 Nov 19;12(22):1965-9. doi: 10.1016/s0960-9822(02)01296-4.
Birds have played a central role in many biological disciplines, particularly ecology, evolution, and behavior. The chicken, as a model vertebrate, also represents an important experimental system for developmental biologists, immunologists, cell biologists, and geneticists. However, genomic resources for the chicken have lagged behind those for other model organisms, with only 1845 nonredundant full-length chicken cDNA sequences currently deposited in the EMBL databank. We describe a large-scale expressed-sequence-tag (EST) project aimed at gene discovery in chickens (http://www.chick.umist.ac.uk). In total, 339,314 ESTs have been sequenced from 64 cDNA libraries generated from 21 different embryonic and adult tissues. These were clustered and assembled into 85,486 contiguous sequences (contigs). We find that a minimum of 38% of the contigs have orthologs in other organisms and define an upper limit of 13,000 new chicken genes. The remaining contigs may include novel avian-specific or rapidly evolving genes. Comparison of the contigs with known chicken genes and orthologs indicates that 30% include cDNAs that contain the start codon and 20% of the contigs represent full-length cDNA sequences. Using this dataset, we estimate that chickens have approximately 35,000 genes in total, suggesting that this number may be a characteristic feature of vertebrates.
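The percentages quoted in the abstract can be turned into approximate contig counts with simple arithmetic. The following is a minimal sketch, not the authors' analysis: the function name `contig_count` and the dictionary keys are hypothetical labels chosen for illustration, and the rounded counts are derived values that do not appear in the abstract itself.

```python
# Figures taken from the abstract.
TOTAL_CONTIGS = 85_486

# Percentages reported in the abstract; the keys are hypothetical labels.
REPORTED_PERCENTAGES = {
    "orthologs_min": 38,      # contigs with orthologs in other organisms (minimum)
    "has_start_codon": 30,    # contigs that include the start codon
    "full_length": 20,        # contigs representing full-length cDNAs
}


def contig_count(percent: int, total: int = TOTAL_CONTIGS) -> int:
    """Return the approximate number of contigs for a reported percentage."""
    return round(total * percent / 100)


# Derived, approximate counts (not stated in the abstract).
approx_counts = {
    label: contig_count(percent)
    for label, percent in REPORTED_PERCENTAGES.items()
}

if __name__ == "__main__":
    for label, count in approx_counts.items():
        print(f"{label}: ~{count} contigs")
```

Because the abstract reports rounded percentages, these counts are only indicative of scale.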