Sheng Weiguo, Swift Stephen, Zhang Leishi, Liu Xiaohui
Department of Information Systems and Computing, Brunel University, Uxbridge, London, UK.
IEEE Trans Syst Man Cybern B Cybern. 2005 Dec;35(6):1156-67. doi: 10.1109/tsmcb.2005.850173.
Clustering is inherently difficult, both in constructing adequate objective functions and in optimizing them. In this paper, we propose an objective function called the Weighted Sum Validity Function (WSVF), which is a weighted sum of several normalized cluster validity functions. We further propose a Hybrid Niching Genetic Algorithm (HNGA) that optimizes the WSVF to automatically evolve both the proper number of clusters and an appropriate partitioning of the data set. Within the HNGA, a niching method is developed to preserve, during the search, both the diversity of the population with respect to the number of clusters encoded in the individuals and the diversity within each subpopulation sharing the same number of clusters. In addition, we hybridize the niching method with the k-means algorithm. The experiments demonstrate the effectiveness of both the HNGA and the WSVF. Compared with other related genetic clustering algorithms, the HNGA consistently and efficiently converges to the best known optimum for the given data, in agreement with the convergence result. The WSVF is generally found to improve the confidence of clustering solutions and to achieve more accurate and robust results.
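To make the idea of a weighted sum of normalized validity functions concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not name the component validity functions or the weights, so `compactness`, `separation`, the equal weights, and the helper names are all hypothetical stand-ins; each component is scaled to lie between 0 and 1 so the terms are comparable before they are summed.

```python
from typing import Callable, Dict, List, Sequence

Point = Sequence[float]
ValidityFn = Callable[[Sequence[Point], Sequence[int]], float]


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _centroids(points: Sequence[Point], labels: Sequence[int]) -> Dict[int, List[float]]:
    """Mean point of each cluster, keyed by cluster label."""
    groups: Dict[int, List[Point]] = {}
    for p, c in zip(points, labels):
        groups.setdefault(c, []).append(p)
    return {c: [sum(col) / len(ps) for col in zip(*ps)] for c, ps in groups.items()}


def compactness(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Hypothetical index in (0, 1]: higher means tighter clusters."""
    cents = _centroids(points, labels)
    mean_sq = sum(_sq_dist(p, cents[c]) for p, c in zip(points, labels)) / len(points)
    return 1.0 / (1.0 + mean_sq)


def separation(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Hypothetical index in [0, 1): higher means centroids are farther apart."""
    cents = list(_centroids(points, labels).values())
    if len(cents) < 2:
        return 0.0
    d = min(_sq_dist(a, b) for i, a in enumerate(cents) for b in cents[i + 1:])
    return d / (1.0 + d)


def wsvf(points: Sequence[Point], labels: Sequence[int],
         functions: Sequence[ValidityFn], weights: Sequence[float]) -> float:
    """Weighted sum of validity functions (assumed already normalized)."""
    if len(functions) != len(weights):
        raise ValueError("one weight is required per validity function")
    return sum(w * f(points, labels) for f, w in zip(functions, weights))


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_labels = [0, 0, 1, 1]   # groups nearby points together
bad_labels = [0, 1, 0, 1]    # mixes the two groups

functions = [compactness, separation]
weights = [0.5, 0.5]         # assumed equal weights, for illustration only

good_score = wsvf(points, good_labels, functions, weights)
bad_score = wsvf(points, bad_labels, functions, weights)
```

In a genetic clustering setting, a score of this kind would serve as the fitness of an individual that encodes a candidate partition, so the natural grouping in the toy data scores higher than the mixed one.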