Tomancak Pavel, Berman Benjamin P, Beaton Amy, Weiszmann Richard, Kwan Elaine, Hartenstein Volker, Celniker Susan E, Rubin Gerald M
Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA.
Genome Biol. 2007;8(7):R145. doi: 10.1186/gb-2007-8-7-r145.
Cell- and tissue-specific gene expression is a defining feature of embryonic development in multicellular organisms. However, the range of gene expression patterns, the extent of the correlation of expression with function, and the classes of genes whose spatial expression is tightly regulated have been unclear due to the lack of an unbiased, genome-wide survey of gene expression patterns.
We determined and documented embryonic expression patterns for 6,003 (44%) of the 13,659 protein-coding genes identified in the Drosophila melanogaster genome, using over 70,000 images and controlled vocabulary annotations. Individual expression patterns are extraordinarily diverse, but by supplementing qualitative in situ hybridization data with quantitative microarray time-course data using a hybrid clustering strategy, we identify groups of genes with similar expression. Of 4,496 genes with detectable expression in the embryo, 2,549 (57%) fall into 10 clusters representing broad expression patterns. The remaining 1,947 (43%) genes fall into 29 clusters representing restricted expression, with 20% patterned as early as the blastoderm stage and the majority restricted to differentiated cell types, such as epithelia, nervous system, or muscle. We investigate the relationship between expression clusters and known molecular and cellular-physiological functions.
Nearly 60% of the genes with detectable expression exhibit broad patterns reflecting quantitative rather than qualitative differences between tissues. The other 40% show tissue-restricted expression; the expression patterns of over 1,500 of these genes are documented here for the first time. Within each of these categories, we identified clusters of genes associated with particular cellular and developmental functions.