Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ.
Department of Molecular Biology, Princeton University, Princeton, NJ 08540, USA.
Proc Natl Acad Sci U S A. 1999 Jun 8;96(12):6745-50. doi: 10.1073/pnas.96.12.6745.
Oligonucleotide arrays can provide a broad picture of the state of the cell, by monitoring the expression level of thousands of genes at the same time. It is of interest to develop techniques for extracting useful information from the resulting data sets. Here we report the application of a two-way clustering method for analyzing a data set consisting of the expression patterns of different cell types. Gene expression in 40 tumor and 22 normal colon tissue samples was analyzed with an Affymetrix oligonucleotide array complementary to more than 6,500 human genes. An efficient two-way clustering algorithm was applied to both the genes and the tissues, revealing broad coherent patterns that suggest a high degree of organization underlying gene expression in these tissues. Coregulated families of genes clustered together, as demonstrated for the ribosomal proteins. Clustering also separated cancerous from noncancerous tissue and cell lines from in vivo tissues on the basis of subtle distributed patterns of genes even when expression of individual genes varied only slightly between the tissues. Two-way clustering thus may be of use both in classifying genes into functional groups and in classifying tissues based on gene expression.
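The idea of clustering both the genes and the tissues of one expression matrix can be illustrated with a minimal sketch. The abstract does not specify the algorithm used, so the code below substitutes plain average-linkage agglomerative clustering on a correlation distance; it is not the authors' method. All function names (`pearson`, `cluster_order`, `two_way_cluster`, `toy_matrix`) are hypothetical, and the 6 x 6 matrix is synthetic data made up for illustration, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def cluster_order(rows):
    """Average-linkage agglomerative clustering on distance 1 - correlation.

    Returns the row indices in the order produced by the successive merges,
    so that similar rows end up next to each other.
    """
    n = len(rows)
    dist = [[1.0 - pearson(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b]) / pairs
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]

def two_way_cluster(matrix):
    """Cluster the rows (genes) and the columns (tissues) of the same matrix."""
    gene_order = cluster_order(matrix)
    tissue_order = cluster_order([list(col) for col in zip(*matrix)])
    reordered = [[matrix[g][t] for t in tissue_order] for g in gene_order]
    return gene_order, tissue_order, reordered

def toy_matrix():
    """Synthetic 6 genes x 6 tissues matrix (illustrative assumption only).

    Tissues 0, 2, 4 play the role of one class and tissues 1, 3, 5 the other;
    genes 0-2 are high in the first class and genes 3-5 are high in the second.
    """
    first_class = {0, 2, 4}
    rows = []
    for g in range(6):
        row = []
        for t in range(6):
            high = (g < 3) == (t in first_class)
            noise = 0.1 * ((7 * g + 3 * t) % 5)  # small deterministic perturbation
            row.append((10.0 if high else 2.0) + noise)
        rows.append(row)
    return rows

gene_order, tissue_order, reordered = two_way_cluster(toy_matrix())
print("gene order:  ", gene_order)
print("tissue order:", tissue_order)
```

Reordering the matrix by both index lists places coexpressed genes in adjacent rows and similar tissues in adjacent columns, which is what makes block patterns visible. The pairwise search here is quadratic per merge and is only suitable for toy inputs; a data set with thousands of genes would need an optimized library routine.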