Validating clustering for gene expression data.

Author information

Yeung K Y, Haynor D R, Ruzzo W L

Affiliation

Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195, USA.

Publication information

Bioinformatics. 2001 Apr;17(4):309-18. doi: 10.1093/bioinformatics/17.4.309.

Abstract

MOTIVATION

Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is used to assess the predictive power of the resulting clusters: meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance.
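The leave-one-condition-out idea described above can be illustrated with a minimal Python sketch. The abstract does not give a formula for "variation in the remaining condition", so the within-cluster root-mean-square deviation used here is an assumption. The helper names (`simple_kmeans`, `alternating_labels`, `leave_one_out_score`), the toy data, and the tiny k-means stand-in are all hypothetical and chosen only for illustration; they are not the authors' implementation.

```python
import math
from typing import Callable, Dict, List

Matrix = List[List[float]]  # rows are genes, columns are experimental conditions


def sq_dist(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def simple_kmeans(rows: Matrix, k: int, iters: int = 10) -> List[int]:
    """Stand-in clustering algorithm (hypothetical helper, not from the paper)."""
    # Deterministic farthest-point initialisation: start at row 0, then keep
    # adding the row that is farthest from the centers chosen so far.
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def alternating_labels(rows: Matrix, k: int) -> List[int]:
    """Uninformative partition that ignores the data (baseline for comparison)."""
    return [i % k for i in range(len(rows))]


def held_out_deviation(values: List[float], labels: List[int]) -> float:
    """Assumed measure: RMS deviation of held-out values around cluster means."""
    groups: Dict[int, List[float]] = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    total = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        total += sum((v - mean) ** 2 for v in members)
    return math.sqrt(total / len(values))


def leave_one_out_score(data: Matrix, cluster: Callable[[Matrix, int], List[int]], k: int) -> float:
    """Cluster on all conditions but one, score on the one left out, sum over conditions."""
    score = 0.0
    for e in range(len(data[0])):
        train = [[x for j, x in enumerate(row) if j != e] for row in data]
        held_out = [row[e] for row in data]
        score += held_out_deviation(held_out, cluster(train, k))
    return score


# Toy data: three genes with low expression and three with high expression.
data = [
    [0.0, 0.1, 0.2, 0.1],
    [0.1, 0.0, 0.1, 0.2],
    [0.2, 0.2, 0.0, 0.0],
    [10.0, 10.1, 9.9, 10.2],
    [10.1, 9.9, 10.0, 10.0],
    [9.9, 10.0, 10.2, 10.1],
]

clustered_score = leave_one_out_score(data, simple_kmeans, k=2)
arbitrary_score = leave_one_out_score(data, alternating_labels, k=2)
print(f"k-means: {clustered_score:.3f}  arbitrary partition: {arbitrary_score:.3f}")
```

A lower score means the clusters predict the held-out condition better. On this toy data the k-means partition scores far lower than the data-independent partition, which is the behaviour the abstract expects of meaningful clusters. Because the clustering function is passed in as an argument, other algorithms could be compared in the same way.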

RESULTS

We successfully applied our methodology to compare six clustering algorithms on four gene expression data sets. We found our quantitative measures of cluster quality to be positively correlated with external standards of cluster quality.
