Son Chang Gue, Bilke Sven, Davis Sean, Greer Braden T, Wei Jun S, Whiteford Craig C, Chen Qing-Rong, Cenacchi Nicola, Khan Javed
Advanced Technology Center, Oncogenomics Section, Pediatric Oncology Branch, National Cancer Institute, National Institutes of Health, Gaithersburg, Maryland 20877, USA.
Genome Res. 2005 Mar;15(3):443-50. doi: 10.1101/gr.3124505.
Genome-wide expression profiling of normal tissue may facilitate our understanding of the etiology of diseased organs and augment the development of new targeted therapeutics. Here, we have developed a high-density gene expression database of 18,927 unique genes for 158 normal human samples from 19 different organs of 30 different individuals using DNA microarrays. We report four main findings. First, despite very diverse sample parameters (e.g., age, ethnicity, sex, and postmortem interval), the expression profiles belonging to the same organs cluster together, demonstrating internal stability of the database. Second, the gene expression profiles reflect major organ-specific functions on the molecular level, indicating consistency of our database with known biology. Third, we demonstrate that any small (i.e., n ≈ 100), randomly selected subset of genes can approximately reproduce the hierarchical clustering of the full data set, suggesting that the observed differential expression of >90% of the probed genes is of biological origin. Fourth, we demonstrate a potential application of this database to cancer research by identifying 19 tumor-specific genes in neuroblastoma. The selected genes are relatively underexpressed in all of the organs examined and belong to therapeutically relevant pathways, making them potential novel diagnostic markers and targets for therapy. We expect this database will be of utility for developing rationally designed molecularly targeted therapeutics in diseases such as cancer, as well as for exploring the functions of genes.
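The third finding (a random subset of roughly 100 genes approximately reproduces the clustering of the full data set) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than taken from the database, the distance (1 minus Pearson correlation) and average linkage are common choices that the abstract does not specify, and the function names `simulate`, `pearson_distance`, and `average_linkage` are hypothetical.

```python
import random
from itertools import combinations

def simulate(n_organs=4, per_organ=5, n_genes=1000, seed=0):
    """Synthetic log-ratio profiles: one mean profile per organ plus sample noise."""
    rng = random.Random(seed)
    samples, labels = [], []
    for organ in range(n_organs):
        centre = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]  # organ signature
        for _ in range(per_organ):
            samples.append([c + rng.gauss(0.0, 0.5) for c in centre])
            labels.append(organ)
    return samples, labels

def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return 1.0 - cov / (vx * vy) ** 0.5

def average_linkage(samples, genes, k):
    """Agglomerative clustering on the chosen genes, cut at k clusters.

    Returns a set of frozensets of sample indices.
    """
    genes = list(genes)
    sub = [[s[g] for g in genes] for s in samples]
    dist = {(i, j): pearson_distance(sub[i], sub[j])
            for i, j in combinations(range(len(sub)), 2)}
    clusters = [frozenset([i]) for i in range(len(sub))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

samples, labels = simulate()
n_genes = len(samples[0])

# Clustering on all genes versus on 100 randomly chosen genes.
full = average_linkage(samples, range(n_genes), k=4)
subset = random.Random(1).sample(range(n_genes), 100)
partial = average_linkage(samples, subset, k=4)

# The partition that groups samples by their (synthetic) organ of origin.
truth = {frozenset(i for i, lab in enumerate(labels) if lab == organ)
         for organ in set(labels)}

print("subset reproduces full clustering:", full == partial)
print("full clustering matches organ labels:", full == truth)
```

In this toy setting the organ signal is spread across all genes, so a small random sample carries enough of it to recover the same grouping. This is the intuition behind reading subset-stable clustering as evidence that most of the observed differential expression is biological rather than noise; it does not reproduce the paper's actual analysis.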