
Boosting k-means clustering with symbiotic organisms search for automatic clustering problems.

Affiliations

School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, KwaZulu-Natal, South Africa.

Department of Computer Technology, Yaba College of Technology, Lagos, Lagos State, Nigeria.

Publication information

PLoS One. 2022 Aug 11;17(8):e0272861. doi: 10.1371/journal.pone.0272861. eCollection 2022.

DOI:10.1371/journal.pone.0272861
PMID:35951672
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9371361/
Abstract

The K-means clustering algorithm is an iterative unsupervised learning algorithm that partitions a given dataset into a predefined number k of distinct, non-overlapping clusters, with each data point belonging to exactly one cluster. However, its performance suffers from sensitivity to the initial cluster centroids, which can lead to convergence to a local optimum, and from the need to specify the number of clusters as an input parameter. Recently, the hybridization of metaheuristic algorithms with K-means has been explored to address these problems and improve the algorithm's performance. Nonetheless, most metaheuristic algorithms require rigorous parameter tuning to achieve an optimum result. This paper proposes a hybrid clustering method that combines the well-known symbiotic organisms search (SOS) algorithm with K-means, using SOS as a global search metaheuristic to generate the optimum initial cluster centroids for K-means. The SOS algorithm is a nearly parameter-free metaheuristic with excellent search quality that requires initialising only a single control parameter. The performance of the proposed algorithm is investigated by comparing it with the classical SOS, classical K-means, and other existing hybrid clustering algorithms on eleven UCI Machine Learning Repository datasets and one artificial dataset. The results of the extensive computational experiments show that the hybrid SOSK-Means performs better on automatic clustering than standard K-means, symbiotic organisms search clustering methods, and other hybrid clustering approaches.
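The hybrid idea in the abstract, a global search that proposes starting centroids which K-means then refines, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed k and uses the sum of squared errors (SSE) as the fitness function, whereas the paper targets automatic clustering, in which the number of clusters is also searched; the abstract does not give the encoding or the fitness measure. The SOS phases follow the commonly described mutualism, commensalism, and parasitism steps. All function names and the `pop`, `gens`, and `seed` parameters are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans(points, centroids, iters=100):
    """Lloyd's K-means, started from the supplied centroids."""
    centroids = list(centroids)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def sos_initial_centroids(points, k, pop=10, gens=30, seed=0):
    """Simplified SOS search for k starting centroids.

    Assumptions (not from the paper): fixed k, fitness = SSE.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    lo = [min(p[i] for p in points) for i in range(dim)] * k
    hi = [max(p[i] for p in points) for i in range(dim)] * k

    def decode(org):  # flat vector of k*dim floats -> list of k centroids
        return [tuple(org[j * dim:(j + 1) * dim]) for j in range(k)]

    def cost(org):
        return sse(points, decode(org))

    eco = [[rng.uniform(a, b) for a, b in zip(lo, hi)] for _ in range(pop)]
    costs = [cost(o) for o in eco]

    def keep_if_better(idx, cand):
        # Clip to the data bounds, then replace organism idx only if fitter.
        cand = [min(max(v, a), b) for v, a, b in zip(cand, lo, hi)]
        c = cost(cand)
        if c < costs[idx]:
            eco[idx], costs[idx] = cand, c

    def other(i):  # random partner different from organism i
        return rng.choice([j for j in range(pop) if j != i])

    for _ in range(gens):
        for i in range(pop):
            best = eco[costs.index(min(costs))]
            # Mutualism: i and j both move relative to the best organism.
            j = other(i)
            xi, xj = eco[i], eco[j]
            mutual = [(a + b) / 2 for a, b in zip(xi, xj)]
            bf1, bf2 = rng.randint(1, 2), rng.randint(1, 2)  # benefit factors
            keep_if_better(i, [a + rng.random() * (b - m * bf1)
                               for a, b, m in zip(xi, best, mutual)])
            keep_if_better(j, [a + rng.random() * (b - m * bf2)
                               for a, b, m in zip(xj, best, mutual)])
            # Commensalism: i benefits from j; j is unaffected.
            j = other(i)
            keep_if_better(i, [a + rng.uniform(-1, 1) * (b - c)
                               for a, b, c in zip(eco[i], best, eco[j])])
            # Parasitism: a randomly mutated copy of i tries to displace j.
            j = other(i)
            parasite = [rng.uniform(a, b) if rng.random() < 0.5 else v
                        for v, a, b in zip(eco[i], lo, hi)]
            keep_if_better(j, parasite)
    return decode(eco[costs.index(min(costs))])


if __name__ == "__main__":
    rng = random.Random(1)
    data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in [(0, 0), (5, 5)] for _ in range(30)]
    start = sos_initial_centroids(data, k=2)
    final = kmeans(data, start)
    print("SSE after SOS:", sse(data, start))
    print("SSE after K-means:", sse(data, final))
```

In this sketch the ecosystem size `pop` plays the role of the single control parameter mentioned in the abstract, and `gens` is only a stopping criterion. Because each K-means iteration cannot increase the SSE, the refined centroids are never worse on that measure than the ones SOS proposed.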


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2d/9371361/cea3ae16bbf1/pone.0272861.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2d/9371361/df7cb6705072/pone.0272861.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2d/9371361/b1330ba35d93/pone.0272861.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2d/9371361/eb8e08621fe6/pone.0272861.g004.jpg

