Du Zhihua, Wang Yiwei, Ji Zhen
Shenzhen University, Shenzhen 518060, China.
Comput Biol Chem. 2008 Aug;32(4):243-7. doi: 10.1016/j.compbiolchem.2008.03.020. Epub 2008 May 27.
Microarray technology is widely used to measure the expression levels of thousands of genes simultaneously. Gene cluster analysis is useful for discovering gene function because co-expressed genes are likely to share the same biological function. K-means is one of the best-known clustering methods. However, it is sensitive to the choice of initial clustering and is easily trapped in a local minimum. The particle-pair optimizer (PPO) is a variant of the traditional particle swarm optimization (PSO) algorithm: a stochastic, particle-pair-based optimization technique that can be applied to a wide range of problems. In this paper we combine PPO and K-means for the first time in a single algorithm, PK-means. Our results indicate that PK-means clustering is generally more accurate than K-means and Fuzzy K-means (FKM). PK-means is also more robust, because it is less sensitive to the randomly selected initial cluster centroids. Finally, our algorithm outperforms these methods, with a fast convergence rate and a low computational load.
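The abstract does not give the PPO update rules, so the following is only a minimal sketch of the general idea of pairing a swarm search with K-means, not the authors' PK-means. It assumes a textbook PSO velocity update applied to whole sets of centroids, with one K-means (Lloyd) step after each move. The function names (`sse`, `kmeans_step`, `kmeans`, `pso_kmeans`), the default of two particles, and the coefficients `w`, `c1`, `c2` are all illustrative assumptions.

```python
import numpy as np

def sse(X, C):
    # Sum of squared distances from each point to its nearest centroid.
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def kmeans_step(X, C):
    # One Lloyd iteration: assign points to the nearest centroid, then re-average.
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    new_C = C.copy()
    for k in range(len(C)):
        members = X[labels == k]
        if len(members):  # an empty cluster keeps its previous centroid
            new_C[k] = members.mean(axis=0)
    return new_C

def kmeans(X, C, iters=50):
    # Plain K-means from a given initialization (fixed number of iterations).
    C = np.asarray(C, dtype=float)
    for _ in range(iters):
        C = kmeans_step(X, C)
    return C

def pso_kmeans(X, k, n_particles=2, iters=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    # Generic PSO + K-means hybrid (assumed scheme, NOT the paper's PPO).
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Each particle is a full set of k centroids drawn from the data points.
    pos = np.stack([X[rng.choice(len(X), k, replace=False)]
                    for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([sse(X, p) for p in pos])
    g = pbest_f.argmin()
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            r1 = rng.random(pos[i].shape)
            r2 = rng.random(pos[i].shape)
            # Standard PSO velocity update toward personal and global bests.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (pbest[i] - pos[i])
                      + c2 * r2 * (gbest - pos[i]))
            # Move the particle, then refine it with one K-means step.
            pos[i] = kmeans_step(X, pos[i] + vel[i])
            f = sse(X, pos[i])
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i].copy(), f
                if f < gbest_f:
                    gbest, gbest_f = pos[i].copy(), f
    # Final local refinement of the best centroid set found by the swarm.
    return kmeans(X, gbest)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
    X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])
    bad_init = X[:3]  # all three starting centroids come from the same blob
    print("K-means SSE from a poor start:", sse(X, kmeans(X, bad_init)))
    print("Hybrid SSE:", sse(X, pso_kmeans(X, 3)))
```

The within-cluster sum of squared errors is used here as the fitness function because it is the quantity K-means itself minimizes. Expression data would first need to be arranged as a genes-by-conditions matrix, and a different distance (for example, correlation-based) may suit it better than the Euclidean distance assumed above.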