Fuzzy C-means method for clustering microarray data.

Authors

Dembélé Doulaye, Kastner Philippe

Affiliation

Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS-INSERM-ULP, BP 10142, 67404 Illkirch Cedex, France.

Publication

Bioinformatics. 2003 May 22;19(8):973-80. doi: 10.1093/bioinformatics/btg119.

Abstract

MOTIVATION

Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes.
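For reference, the textbook formulation of FCM can be written as follows. The notation is an assumption introduced here and is not taken from the abstract: $x_i$ is the expression profile of gene $i$, $c_k$ the centroid of cluster $k$, $u_{ik}$ the membership of gene $i$ in cluster $k$, and $m > 1$ the fuzziness parameter.

```latex
% Objective minimized by FCM, subject to memberships summing to 1 per gene
J_m(U, C) = \sum_{i=1}^{N} \sum_{k=1}^{K} u_{ik}^{\,m} \,\lVert x_i - c_k \rVert^{2},
\qquad \sum_{k=1}^{K} u_{ik} = 1, \quad m > 1

% Alternating updates
c_k = \frac{\sum_{i=1}^{N} u_{ik}^{\,m}\, x_i}{\sum_{i=1}^{N} u_{ik}^{\,m}},
\qquad
u_{ik} = \left[ \sum_{j=1}^{K}
  \left( \frac{\lVert x_i - c_k \rVert}{\lVert x_i - c_j \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1}
```

As $m \to 1$ the memberships approach a hard, K-means-like assignment; as $m$ grows they flatten toward $1/K$ for every gene, which is why the choice of $m$ matters.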

RESULTS

A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster.
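To make the role of m and of the membership threshold concrete, here is a minimal NumPy sketch of standard FCM followed by threshold-based gene selection. It is not the authors' Matlab code, and it does not reproduce their empirical procedure for choosing m. The function names `fuzzy_c_means` and `select_core_genes`, the synthetic data, and the threshold of 0.7 are all illustrative assumptions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=300, tol=1e-6, seed=0):
    """Textbook FCM (hypothetical helper, not the paper's implementation).

    X: array of shape (n_genes, n_conditions).
    Returns (centers, U), where U[i, k] is the membership of gene i in cluster k.
    """
    if m <= 1.0:
        raise ValueError("fuzziness parameter m must be > 1")
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids are membership-weighted means of the profiles.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every gene to every centroid.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def select_core_genes(U, threshold=0.7):
    """Return each gene's best cluster and a mask of genes above the threshold."""
    labels = U.argmax(axis=1)
    keep = U.max(axis=1) >= threshold  # threshold value is an arbitrary example
    return labels, keep

# Synthetic stand-in for expression profiles: two well-separated groups.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 4)), rng.normal(5.0, 0.3, (20, 4))])
centers, U = fuzzy_c_means(X, n_clusters=2, m=2.0)
labels, keep = select_core_genes(U, threshold=0.7)
```

Raising m in this sketch flattens the rows of `U` toward equal memberships, so fewer genes pass a fixed threshold; lowering it toward 1 pushes the result toward a hard partition.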

AVAILABILITY

Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
