Suppr超能文献

使用估计子样本大小加速模糊C均值算法

Accelerating Fuzzy-C Means Using an Estimated Subsample Size.

作者信息

Parker Jonathon K, Hall Lawrence O

机构信息

Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620, USA.

出版信息

IEEE Trans Fuzzy Syst. 2014 Oct;22(5):1229-1244. doi: 10.1109/TFUZZ.2013.2286993. Epub 2013 Oct 23.

Abstract

Many algorithms designed to accelerate the Fuzzy c-Means (FCM) clustering algorithm randomly sample the data. Typically, no statistical method is used to estimate the subsample size, despite the impact subsample sizes have on speed and quality. This paper introduces two new accelerated algorithms, GOFCM and MSERFCM, that use a statistical method to estimate the subsample size. GOFCM, a variant of SPFCM, also leverages progressive sampling. MSERFCM, a variant of rseFCM, gains a speedup from improved initialization. A general, novel stopping criterion for accelerated clustering is introduced. The new algorithms are compared to FCM and four accelerated variants of FCM. GOFCM's speedup was 4-47 times that of FCM and faster than SPFCM on each of the six datasets used in experiments. For five of the datasets, partitions were within 1% of those of FCM. MSERFCM's speedup was 5-26 times that of FCM and produced partitions within 3% of those of FCM on all datasets. A unique dataset, consisting of plankton images, exposed the strengths and weaknesses of many of the algorithms tested. It is shown that the new stopping criterion is effective in speeding up algorithms such as SPFCM and the final partitions are very close to those of FCM.

摘要

许多旨在加速模糊 c 均值(FCM)聚类算法的算法会对数据进行随机采样。通常,尽管子样本大小会对速度和质量产生影响,但并未使用统计方法来估计子样本大小。本文介绍了两种新的加速算法,即 GOFCM 和 MSERFCM,它们使用统计方法来估计子样本大小。GOFCM 是 SPFCM 的一种变体,还利用了渐进采样。MSERFCM 是 rseFCM 的一种变体,通过改进初始化实现了加速。引入了一种通用的、新颖的加速聚类停止准则。将新算法与 FCM 以及 FCM 的四种加速变体进行了比较。在实验使用的六个数据集中的每一个上,GOFCM 的加速比 FCM 快 4 至 47 倍,并且比 SPFCM 更快。对于其中五个数据集,划分结果与 FCM 的划分结果相差在 1%以内。MSERFCM 的加速比 FCM 快 5 至 26 倍,并且在所有数据集上产生的划分结果与 FCM 的划分结果相差在 3%以内。一个由浮游生物图像组成的独特数据集揭示了许多测试算法的优缺点。结果表明,新的停止准则在加速诸如 SPFCM 等算法方面是有效的,并且最终划分结果与 FCM 的划分结果非常接近。

相似文献

1
Accelerating Fuzzy-C Means Using an Estimated Subsample Size.使用估计子样本大小加速模糊C均值算法
IEEE Trans Fuzzy Syst. 2014 Oct;22(5):1229-1244. doi: 10.1109/TFUZZ.2013.2286993. Epub 2013 Oct 23.
2
Generalized fuzzy C-means clustering algorithm with improved fuzzy partitions.具有改进模糊划分的广义模糊C均值聚类算法
IEEE Trans Syst Man Cybern B Cybern. 2009 Jun;39(3):578-91. doi: 10.1109/TSMCB.2008.2004818. Epub 2009 Jan 23.

本文引用的文献

2
A Scalable Framework For Segmenting Magnetic Resonance Images.一种用于分割磁共振图像的可扩展框架。
J Signal Process Syst. 2009 Jan 1;54(1-3):183-203. doi: 10.1007/s11265-008-0243-1.
3
Complexity reduction for "large image" processing.用于“大图像”处理的复杂度降低
IEEE Trans Syst Man Cybern B Cybern. 2002;32(5):598-611. doi: 10.1109/TSMCB.2002.1033179.
4
Fuzzy c-means clustering with spatial information for image segmentation.用于图像分割的带空间信息的模糊c均值聚类
Comput Med Imaging Graph. 2006 Jan;30(1):9-15. doi: 10.1016/j.compmedimag.2005.10.001. Epub 2005 Dec 19.
5
Fuzzy C-means method for clustering microarray data.用于微阵列数据聚类的模糊C均值方法。
Bioinformatics. 2003 May 22;19(8):973-80. doi: 10.1093/bioinformatics/btg119.
6
Soft vector quantization and the EM algorithm.软矢量量化与期望最大化算法
Neural Netw. 1998 Apr;11(3):467-477. doi: 10.1016/s0893-6080(97)00147-0.

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验