School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, KwaZulu-Natal, South Africa.
Department of Computer Technology, Yaba College of Technology, Lagos, Lagos State, Nigeria.
PLoS One. 2022 Aug 11;17(8):e0272861. doi: 10.1371/journal.pone.0272861. eCollection 2022.
The K-Means clustering algorithm is an iterative unsupervised learning algorithm that partitions a given dataset into k predefined, distinct, non-overlapping clusters, with each data point belonging to exactly one cluster. However, its performance suffers from sensitivity to the initial cluster centroids, which can cause convergence to a local optimum, and from the need to specify the number of clusters as an input parameter. Recently, the hybridization of metaheuristic algorithms with K-Means has been explored to address these problems and improve the algorithm's performance. Nonetheless, most metaheuristic algorithms require rigorous parameter tuning to achieve an optimum result. This paper proposes a hybrid clustering method that combines the well-known symbiotic organisms search (SOS) algorithm with K-Means, using SOS as a global search metaheuristic to generate the optimum initial cluster centroids for K-Means. The SOS algorithm is a largely parameter-free metaheuristic with excellent search quality that requires initialising only a single control parameter. The performance of the proposed algorithm is investigated by comparing it with the classical SOS, classical K-Means, and other existing hybrid clustering algorithms on eleven UCI Machine Learning Repository datasets and one artificial dataset. The results of extensive computational experiments show improved performance of the hybrid SOSK-Means on automatic clustering compared with the standard K-Means, symbiotic organisms search clustering methods, and other hybrid clustering approaches.
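The two-stage idea described above (a global SOS search for initial centroids, followed by K-Means refinement) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed k rather than the automatic cluster-number selection studied in the paper, uses the within-cluster sum of squared errors as the fitness, and follows the standard mutualism, commensalism, and parasitism phases of SOS. The function names (`assign`, `sos_centroids`, `kmeans`), the default `pop_size` and `iters`, and the 0.5 parasite mutation probability are all hypothetical choices made for illustration.

```python
import numpy as np

def assign(X, C):
    """Return nearest-centroid labels and the sum of squared errors (SSE)."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1).sum()

def sos_centroids(X, k, pop_size=20, iters=50, rng=None):
    """SOS search over flattened centroid sets (assumed fitness: SSE, fixed k)."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    lo, hi = np.tile(X.min(axis=0), k), np.tile(X.max(axis=0), k)
    fitness = lambda v: assign(X, v.reshape(k, d))[1]
    # Each organism encodes k centroids, initialised from random data points.
    pop = np.stack([X[rng.choice(len(X), k, replace=False)].ravel()
                    for _ in range(pop_size)])
    fit = np.array([fitness(p) for p in pop])

    def try_replace(idx, cand):
        # Greedy selection: keep the candidate only if it lowers the SSE.
        cand = np.clip(cand, lo, hi)
        f = fitness(cand)
        if f < fit[idx]:
            pop[idx], fit[idx] = cand, f

    def partner(i):
        # Pick a random organism index different from i.
        j = rng.integers(pop_size - 1)
        return j + (j >= i)

    n = pop.shape[1]
    for _ in range(iters):
        for i in range(pop_size):
            best = pop[fit.argmin()].copy()
            # Mutualism: i and j both move toward the best organism.
            j = partner(i)
            mutual = (pop[i] + pop[j]) / 2
            bf1, bf2 = rng.integers(1, 3, size=2)  # benefit factors in {1, 2}
            new_i = pop[i] + rng.random(n) * (best - mutual * bf1)
            new_j = pop[j] + rng.random(n) * (best - mutual * bf2)
            try_replace(i, new_i)
            try_replace(j, new_j)
            # Commensalism: only i benefits from its interaction with j.
            j = partner(i)
            try_replace(i, pop[i] + rng.uniform(-1, 1, n) * (best - pop[j]))
            # Parasitism: a mutated copy of i competes with j for its slot.
            j = partner(i)
            parasite = pop[i].copy()
            mask = rng.random(n) < 0.5  # assumed mutation probability
            parasite[mask] = rng.uniform(lo[mask], hi[mask])
            try_replace(j, parasite)
    return pop[fit.argmin()].reshape(k, d)

def kmeans(X, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    C = centroids.copy()
    for _ in range(max_iter):
        labels, _ = assign(X, C)
        new_C = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                          else C[c] for c in range(len(C))])
        if np.allclose(new_C, C):
            break
        C = new_C
    return C, assign(X, C)[0]
```

A call such as `kmeans(X, sos_centroids(X, k=3))` would run both stages in sequence. In this sketch the ecosystem size `pop_size` is the main quantity to choose besides the iteration budget, which is in the spirit of the abstract's remark that SOS needs little parameter tuning.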