
A clustering effectiveness measurement model based on merging similar clusters.

Authors

Duan Guiqin, Zou Chensong

Affiliations

School of Computer and Information Engineering, Guangdong Songshan Vocational and Technical College, Shaoguan, China.

Shaoguan Ecological and Cultural Big Data Engineering & Research Center, Shaoguan, China.

Publication information

PeerJ Comput Sci. 2024 Feb 29;10:e1863. doi: 10.7717/peerj-cs.1863. eCollection 2024.

DOI: 10.7717/peerj-cs.1863
PMID: 38435574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10909172/
Abstract

This article presents a clustering effectiveness measurement model based on merging similar clusters. It addresses problems that the affinity propagation (AP) algorithm encounters during clustering, such as excessive local clustering and low accuracy, as well as the invalid clustering evaluation results that arise from the lack of variety in some internal evaluation indices when the proportion of clusters is very high. First, building on the "rough clustering" process of the AP algorithm, similar clusters are merged according to the relationship between the similarity of any two clusters and the average inter-cluster similarity over the entire sample set, which decreases the maximum number of clusters. Then, a new scheme is proposed to calculate intra-cluster compactness, inter-cluster relative density, and the inter-cluster overlap coefficient. On the basis of this scheme, several internal evaluation indices based on intra-cluster cohesion and inter-cluster dispersion are designed. Experiments on the public UCI and NSL-KDD datasets show that the proposed model performs clustering and classification correctly and provides accurate ranges for clustering, and that it is significantly superior to the three improved clustering algorithms it was compared with on intrusion detection indices such as detection rate and false positive rate (FPR).
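
The merging step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact similarity measure or merging rule, so the code below rests on assumptions made only for illustration: similarity is taken as the negative Euclidean distance between cluster centroids, and two clusters count as "similar" when their similarity is strictly above the mean over all cluster pairs. The function names `centroid` and `merge_similar_clusters` are hypothetical, and the input stands in for the rough clusters that an AP run would produce.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def merge_similar_clusters(clusters):
    """Merge clusters whose pairwise similarity exceeds the average.

    Assumption (not from the paper): similarity is the negative Euclidean
    distance between centroids, and "similar" means strictly above the
    mean similarity over all cluster pairs.
    """
    k = len(clusters)
    if k < 2:
        return [list(c) for c in clusters]
    centers = [centroid(c) for c in clusters]
    sim = {(i, j): -dist(centers[i], centers[j])
           for i, j in combinations(range(k), 2)}
    mean_sim = sum(sim.values()) / len(sim)

    # Union-find, so that chains of similar clusters end up in one group.
    parent = list(range(k))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), s in sim.items():
        if s > mean_sim:
            parent[find(j)] = find(i)

    merged = {}
    for i, c in enumerate(clusters):
        merged.setdefault(find(i), []).extend(c)
    return list(merged.values())


# Four rough clusters (toy data): two near the origin, two near (9, 9).
rough = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(0.5, 0.4), (0.6, 0.6)],
    [(9.0, 9.0), (9.2, 9.1)],
    [(9.5, 9.6), (9.4, 9.3)],
]
merged = merge_similar_clusters(rough)
print(len(merged), [len(c) for c in merged])
```

With this toy input the two clusters near the origin are merged, as are the two near (9, 9), so the number of clusters drops from four to two. Under the strict "above the mean" rule used here, exactly two input clusters are never merged, because their single similarity value equals the mean.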

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6019/10909172/982a4c21df21/peerj-cs-10-1863-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6019/10909172/d53cb0cdee47/peerj-cs-10-1863-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6019/10909172/423cb1a9e7fa/peerj-cs-10-1863-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6019/10909172/754c2676d48e/peerj-cs-10-1863-g004.jpg

Similar articles

1
A clustering effectiveness measurement model based on merging similar clusters.
PeerJ Comput Sci. 2024 Feb 29;10:e1863. doi: 10.7717/peerj-cs.1863. eCollection 2024.
2
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
3
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters.
Comput Intell Neurosci. 2016;2016:2647389. doi: 10.1155/2016/2647389. Epub 2016 Nov 29.
4
Metric for measuring the effectiveness of clustering of DNA microarray expression.
BMC Bioinformatics. 2006 Sep 6;7 Suppl 2(Suppl 2):S5. doi: 10.1186/1471-2105-7-S2-S5.
5
A New Validity Index Based on Fuzzy Energy and Fuzzy Entropy Measures in Fuzzy Clustering Problems.
Entropy (Basel). 2020 Oct 23;22(11):1200. doi: 10.3390/e22111200.
6
RRH Clustering Using Affinity Propagation Algorithm with Adaptive Thresholding and Greedy Merging in Cloud Radio Access Network.
Sensors (Basel). 2021 Jan 12;21(2):480. doi: 10.3390/s21020480.
7
An extended affinity propagation clustering method based on different data density types.
Comput Intell Neurosci. 2015;2015:828057. doi: 10.1155/2015/828057. Epub 2015 Jan 21.
8
A parameter-free deep embedded clustering method for single-cell RNA-seq data.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac172.
9
Simultaneous Subspace Clustering and Cluster Number Estimating Based on Triplet Relationship.
IEEE Trans Image Process. 2019 Aug;28(8):3973-3985. doi: 10.1109/TIP.2019.2903294. Epub 2019 Mar 6.
10
Density propagation based adaptive multi-density clustering algorithm.
PLoS One. 2018 Jul 18;13(7):e0198948. doi: 10.1371/journal.pone.0198948. eCollection 2018.
