Correcting the loss of cell-cycle synchrony in clustering analysis of microarray data using weights.

Author information

Duan Fenghai, Zhang Heping

Affiliation

Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520-8034, USA.

Publication information

Bioinformatics. 2004 Jul 22;20(11):1766-71. doi: 10.1093/bioinformatics/bth169. Epub 2004 May 27.

Abstract

MOTIVATION

Because cell-cycle data sets suffer from a loss of synchrony, standard clustering methods (e.g. k-means), which group open reading frames (ORFs) by similarity of expression level, are deficient unless the temporal pattern of the ORFs' expression levels is taken into account.

METHODS

We propose to improve the performance of the k-means method by assigning a time-decreasing weight at its variable level, and we evaluate this 'weighted k-means' on a yeast cell-cycle data set. Protein complexes from a public website serve as biological benchmarks. To compare the k-means clusters with the structures of the protein complexes, we measure the agreement between these two ways of clustering via the adjusted Rand index.
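
The agreement measure named above can be illustrated with a minimal sketch of the adjusted Rand index in its usual pair-counting form, written in plain Python. The function name `adjusted_rand_index` is an assumption made for illustration and is not taken from the paper. The sketch also assumes each item carries exactly one label in each partition, so it does not address ORFs that belong to several protein complexes or to none.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two flat partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)

    # Contingency counts: number of items sharing each (label_a, label_b) pair.
    pair_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Numbers of item pairs placed together jointly and in each partition.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree about

    # Expected index under random labelling with fixed cluster sizes.
    expected = sum_a * sum_b / total_pairs
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_pairs - expected) / (max_index - expected)
```

A value of 1 means the two partitions are identical up to relabelling, and values near 0 mean agreement is no better than chance.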

RESULTS

Our results show the time-decreasing weight function $\exp[-\tfrac{1}{2}(t^2/C^2)]$, which we assign to the variable level of k-means, generally increases the agreement between protein complexes and k-means clusters when $C$ is near the length of two cell cycles.
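
As a rough illustration of how such a weight might enter k-means, the sketch below evaluates $w(t) = \exp[-\tfrac{1}{2}(t^2/C^2)]$ at each sampling time and uses it in a weighted squared Euclidean distance inside a plain Lloyd-style k-means. This is one assumed reading of "assigning a weight to the variable level", not the authors' implementation. The function names, the caller-supplied initial centers, and the iteration cap are all hypothetical.

```python
import math


def time_weights(times, c):
    """Weight exp(-t^2 / (2 C^2)) for each sampling time t."""
    return [math.exp(-0.5 * (t / c) ** 2) for t in times]


def weighted_sq_dist(x, y, weights):
    """Squared Euclidean distance with one weight per time point (variable)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y))


def weighted_kmeans(profiles, weights, init_centers, n_iter=100):
    """Lloyd-style k-means using the weighted distance (illustrative sketch)."""
    centers = [list(c) for c in init_centers]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center under the weighted distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: weighted_sq_dist(p, centers[j], weights))
            for p in profiles
        ]
        if new_labels == labels:
            break  # converged: assignments stopped changing
        labels = new_labels
        # Update step: with positive per-variable weights shared by all
        # profiles, the minimizing center is still the plain member mean.
        for j in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Here `profiles` would hold one expression time course per ORF and `times` the matching sampling times. Later time points, where synchrony has decayed, then contribute less to the distance, and `c` would be set near the length of two cell cycles as the abstract suggests.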
