Maji Pradipta, Pal Sankar K
Center for Soft Computing Research, Indian Statistical Institute, Kolkata 700 108, India.
IEEE Trans Syst Man Cybern B Cybern. 2007 Dec;37(6):1529-40. doi: 10.1109/tsmcb.2007.906578.
This paper proposes a generalized hybrid unsupervised learning algorithm, termed rough-fuzzy possibilistic c-means (RFPCM), that integrates the principles of rough sets and fuzzy sets. The lower and upper approximations of rough sets deal with uncertainty, vagueness, and incompleteness in class definition, while the membership function of fuzzy sets enables efficient handling of overlapping partitions. RFPCM incorporates probabilistic and possibilistic memberships simultaneously to avoid both the noise sensitivity of fuzzy c-means (FCM) and the coincident clusters of possibilistic c-means (PCM). The concept of a crisp lower bound and a fuzzy boundary of a class, introduced in RFPCM, enables efficient selection of cluster prototypes. The algorithm is generalized in the sense that all existing variants of c-means algorithms can be derived from it as special cases. Several quantitative indices based on rough sets are introduced to evaluate the performance of the proposed algorithm. Its effectiveness, along with a comparison with other algorithms, is demonstrated both qualitatively and quantitatively on a set of real-life data sets.
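The abstract gives no formulas, so the following is only a minimal sketch of how one prototype update in this family of algorithms might look. It assumes the standard FCM membership and PCM typicality expressions. The mixing weight `a`, the lower-approximation weight `w`, the threshold `delta`, and the scale parameters `eta` are hypothetical names chosen for illustration. The paper's exact update rules may differ.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def memberships(x, centers, eta, m=2.0, a=0.5):
    """Combined membership a*u + (1-a)*t of point x in each cluster.

    u is the probabilistic (FCM) membership, t the possibilistic (PCM)
    typicality. The linear mix with weight `a` is an assumption.
    """
    d2 = [max(sq_dist(x, c), 1e-12) for c in centers]  # avoid division by zero
    e = 1.0 / (m - 1.0)
    # FCM membership: relative to all clusters, sums to 1 over clusters.
    u = [1.0 / sum((di / dk) ** e for dk in d2) for di in d2]
    # PCM typicality: depends only on the distance to the cluster itself.
    t = [1.0 / (1.0 + (di / ei) ** e) for di, ei in zip(d2, eta)]
    return [a * ui + (1.0 - a) * ti for ui, ti in zip(u, t)]


def partition(points, centers, eta, delta=0.2, m=2.0, a=0.5):
    """Assign each point to a crisp lower approximation or a fuzzy boundary.

    A point whose two highest memberships differ by more than `delta`
    goes to the lower approximation of its best cluster; otherwise it is
    placed in the boundary of both of its top two clusters.
    Requires at least two clusters.
    """
    c = len(centers)
    lower = [[] for _ in range(c)]
    boundary = [[] for _ in range(c)]
    for x in points:
        mu = memberships(x, centers, eta, m, a)
        order = sorted(range(c), key=lambda i: mu[i], reverse=True)
        best, second = order[0], order[1]
        if mu[best] - mu[second] > delta:
            lower[best].append(x)
        else:
            boundary[best].append((x, mu[best]))
            boundary[second].append((x, mu[second]))
    return lower, boundary


def update_center(lower_i, boundary_i, old, w=0.7, m=2.0):
    """New prototype: weighted mix of crisp lower mean and fuzzy boundary mean."""
    dim = len(old)
    crisp = fuzzy = None
    if lower_i:
        crisp = [sum(p[k] for p in lower_i) / len(lower_i) for k in range(dim)]
    if boundary_i:
        total = sum(mu ** m for _, mu in boundary_i)
        fuzzy = [sum((mu ** m) * p[k] for p, mu in boundary_i) / total
                 for k in range(dim)]
    if crisp is not None and fuzzy is not None:
        return tuple(w * cr + (1.0 - w) * fz for cr, fz in zip(crisp, fuzzy))
    if crisp is not None:
        return tuple(crisp)
    if fuzzy is not None:
        return tuple(fuzzy)
    return tuple(old)  # empty cluster: keep the previous prototype


def rfpcm_step(points, centers, eta, delta=0.2, w=0.7, m=2.0, a=0.5):
    """One illustrative iteration: partition the points, then update prototypes."""
    lower, boundary = partition(points, centers, eta, delta, m, a)
    return [update_center(lower[i], boundary[i], centers[i], w, m)
            for i in range(len(centers))]
```

In this sketch, points that clearly belong to one cluster contribute with equal (crisp) weight, while ambiguous points contribute only through their fuzzy memberships. Calling `rfpcm_step` repeatedly until the prototypes stop moving would give a complete, if simplified, clustering loop.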