Novel clustering algorithm for microarray expression data in a truncated SVD space.

Authors

Horn David, Axel Inon

Affiliation

School of Physics and Astronomy, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.

Publication information

Bioinformatics. 2003 Jun 12;19(9):1110-5. doi: 10.1093/bioinformatics/btg053.

Abstract

MOTIVATION

This paper introduces the application of a novel clustering method to microarray expression data. Its first stage involves compression of dimensions that can be achieved by applying SVD to the gene-sample matrix in microarray problems. Thus the data (samples or genes) can be represented by vectors in a truncated space of low dimensionality, 4 and 5 in the examples studied here. We find it preferable to project all vectors onto the unit sphere before applying a clustering algorithm. The clustering algorithm used here is the quantum clustering method that has one free scale parameter. Although the method is not hierarchical, it can be modified to allow hierarchy in terms of this scale parameter.
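The three stages described above (truncation by SVD, projection onto the unit sphere, quantum clustering with a single scale parameter) can be illustrated with a minimal NumPy sketch. It is not the authors' Matlab software. Several choices are assumptions made for illustration only: samples are represented by their rows in the first k right singular vectors without singular-value weighting, the cluster centers are located by plain gradient descent on the quantum-clustering potential, and the function names, the step size, the iteration count, and the merge tolerance are all hypothetical. The potential is taken in the standard form derived from a Gaussian Parzen-window wave function, V(x) = Σ r_i² w_i / (2σ² Σ w_i) + const, with r_i = |x - x_i| and w_i = exp(-r_i² / (2σ²)).

```python
import numpy as np


def truncated_svd_embedding(X, k):
    """Represent the samples (columns) of a genes x samples matrix in k dims.

    Assumption: each sample is described by its row in the first k right
    singular vectors, then rescaled to unit length (projection onto the sphere).
    """
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = Vt[:k].T                                   # shape (n_samples, k)
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)


def qc_potential_gradient(x, data, sigma):
    """Gradient at x of V(x) = sum_i r_i^2 w_i / (2 sigma^2 sum_i w_i) + const."""
    diff = x - data                                # (n, k)
    r2 = np.sum(diff ** 2, axis=1)                 # squared distances r_i^2
    w = np.exp(-r2 / (2 * sigma ** 2))             # Gaussian weights w_i
    psi = w.sum()                                  # Parzen-window estimate
    A = np.sum(r2 * w)
    grad_A = ((2 - r2 / sigma ** 2) * w) @ diff
    grad_psi = -(w @ diff) / sigma ** 2
    return (grad_A * psi - A * grad_psi) / (2 * sigma ** 2 * psi ** 2)


def quantum_cluster(data, sigma, step=0.2, n_iter=300, tol=None):
    """Slide a copy of every point downhill on V, then merge coinciding points.

    step, n_iter and tol are illustrative defaults, not values from the paper.
    """
    tol = sigma / 2 if tol is None else tol
    points = np.array(data, dtype=float)
    for _ in range(n_iter):
        grads = np.array([qc_potential_gradient(p, data, sigma) for p in points])
        points -= step * sigma ** 2 * grads        # step scaled by sigma^2
    centers, labels = [], []
    for p in points:
        for j, c in enumerate(centers):
            if np.linalg.norm(p - c) < tol:        # same minimum of V
                labels.append(j)
                break
        else:
            centers.append(p)                      # new minimum found
            labels.append(len(centers) - 1)
    return np.array(labels), np.array(centers)


# Synthetic example: 50 "genes", 20 "samples" in two groups of 10.
rng = np.random.default_rng(0)
a, b = rng.normal(size=(2, 50))
group_a = np.r_[np.ones(10), np.zeros(10)]
X = np.outer(a, group_a) + np.outer(b, 1 - group_a)
X += 0.05 * rng.normal(size=X.shape)

Z = truncated_svd_embedding(X, k=2)
labels, centers = quantum_cluster(Z, sigma=0.5)
print(labels)
```

In this sketch sigma plays the role of the free scale parameter: larger values merge neighbouring minima of the potential and smaller values split them, which is one way a hierarchy in terms of the scale could be explored.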

RESULTS

We apply our method to three data sets. The results are very promising. On cancer cell data we obtain a dendrogram that reflects correct groupings of cells. In an AML/ALL data set we obtain very good clustering of samples into four classes of the data. Finally, in clustering of genes in yeast cell cycle data we obtain four groups in a problem that is estimated to contain five families.

AVAILABILITY

Software is available as Matlab programs at http://neuron.tau.ac.il/~horn/QC.htm.
