Technical College of Engineering, Al-Bayan University, Baghdad, Iraq.
Data Analytics & AI Research Group, College of Computing and Digital Technology, Faculty of Computing, Engineering and the Built Environment, Birmingham City University, Birmingham, United Kingdom.
PLoS One. 2023 Jul 5;18(7):e0288044. doi: 10.1371/journal.pone.0288044. eCollection 2023.
Retrieving important information from a dataset often relies on a data mining technique known as data clustering (DC). DC groups objects so that the members of each group share similar characteristics. Clustering typically involves grouping the data around k cluster centres, which are usually selected at random. The issues behind DC have prompted a search for alternative solutions. Recently, a nature-inspired optimization algorithm named the Black Hole Algorithm (BHA) was developed to address several well-known optimization problems. The BHA is a population-based metaheuristic that mimics the natural phenomenon of black holes, whereby each star represents a candidate solution moving through the solution space. The original BHA performed better than other algorithms when applied to a benchmark dataset, despite its poor exploration capability. Hence, this paper presents a multi-population generalization of BHA, called MBHA, in which the performance of the algorithm depends not on a single best-found solution but on a set of generated best solutions. The proposed method was tested on a set of nine widely used benchmark test functions. The experimental results indicated that the method produces highly precise results compared with BHA and the comparable algorithms in the study, as well as excellent robustness. Furthermore, the proposed MBHA achieved a high rate of convergence on six real datasets (collected from the UCL machine learning lab), making it suitable for DC problems. Lastly, the evaluations indicated that the proposed algorithm is appropriate for resolving DC issues.
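The multi-population idea can be illustrated with a minimal sketch. The abstract does not give the MBHA update rules, so everything below is an assumption: the star movement and event-horizon rule follow the commonly described basic black hole search, the sub-populations simply evolve independently with the best black hole tracked across them, and the names `sphere`, `black_hole_step`, `multi_population_search`, and all parameter values are hypothetical.

```python
import math
import random


def sphere(x):
    """Illustrative benchmark objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def black_hole_step(stars, objective, low, high, rng):
    """One iteration of a basic black hole search on one population (minimisation).

    Assumed rules, not taken from the paper: stars move a random fraction of the
    way toward the best star (the black hole), and stars that end up inside the
    event horizon are replaced by new random stars.
    """
    fitness = [objective(s) for s in stars]
    bh_idx = min(range(len(stars)), key=fitness.__getitem__)
    black_hole = stars[bh_idx][:]

    # Pull every other star toward the current black hole, clamped to the bounds.
    for i, s in enumerate(stars):
        if i == bh_idx:
            continue
        r = rng.random()
        stars[i] = [min(high, max(low, v + r * (b - v))) for v, b in zip(s, black_hole)]

    # A moved star may now be better than the old black hole, so re-select it.
    fitness = [objective(s) for s in stars]
    bh_idx = min(range(len(stars)), key=fitness.__getitem__)
    black_hole = stars[bh_idx][:]

    # Event horizon radius: black hole fitness divided by total population fitness.
    total = sum(fitness)
    radius = fitness[bh_idx] / total if total > 0 else 0.0
    for i, s in enumerate(stars):
        if i != bh_idx and math.dist(s, black_hole) < radius:
            stars[i] = [rng.uniform(low, high) for _ in s]

    return black_hole, fitness[bh_idx]


def multi_population_search(objective, dim, n_pops=4, pop_size=10, iters=200,
                            low=-5.0, high=5.0, seed=0):
    """Run several independent populations and track the best black hole overall."""
    rng = random.Random(seed)
    pops = [[[rng.uniform(low, high) for _ in range(dim)] for _ in range(pop_size)]
            for _ in range(n_pops)]
    best, best_f = None, math.inf
    history = []
    for _ in range(iters):
        for pop in pops:
            bh, f = black_hole_step(pop, objective, low, high, rng)
            if f < best_f:
                best, best_f = bh, f
        history.append(best_f)  # best value found so far, after each iteration
    return best, best_f, history


if __name__ == "__main__":
    solution, value, _ = multi_population_search(sphere, dim=3)
    print(solution, value)
```

For a clustering problem, the objective would instead take a flattened set of k candidate centres and return, for example, the sum of distances from each data point to its nearest centre; that choice is likewise an assumption rather than something stated in the abstract.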