IEEE Trans Neural Netw Learn Syst. 2017 May;28(5):1123-1138. doi: 10.1109/TNNLS.2015.2511179. Epub 2016 Feb 18.
Existing semisupervised spectral clustering approaches have two major drawbacks: either they cannot cope with multiple categories of supervision, or their effectiveness is sometimes unstable. To address these issues, this paper proposes two normalized affinity and penalty jointly constrained spectral clustering frameworks and their corresponding algorithms, referred to as type-I affinity and penalty jointly constrained spectral clustering (TI-APJCSC) and type-II affinity and penalty jointly constrained spectral clustering (TII-APJCSC), respectively. TI refers to type-I and TII to type-II. The significance of this paper is fourfold. First, benefiting from the distinctive affinity and penalty jointly constrained strategies, both TI-APJCSC and TII-APJCSC are substantially more effective than the existing methods. Second, both TI-APJCSC and TII-APJCSC are fully compatible with the three well-known categories of supervision, i.e., class labels, pairwise constraints, and grouping information. Third, owing to the careful normalization of the frameworks, both TI-APJCSC and TII-APJCSC are quite flexible: with a single tradeoff factor varying in the small fixed interval (0, 1], they can self-adapt to any semisupervised scenario. Finally, both TI-APJCSC and TII-APJCSC demonstrate strong robustness, not only to the number of pairwise constraints but also to the parameter for affinity measurement. As such, the novel TI-APJCSC and TII-APJCSC algorithms are very practical for medium- and small-scale semisupervised data sets. Experimental studies on both synthetic and real-life semisupervised data sets thoroughly evaluated and demonstrated these advantages.
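The abstract names three categories of supervision: class labels, pairwise constraints, and grouping information. One common way to treat them uniformly, which is not necessarily how TI-APJCSC or TII-APJCSC handle them, is to reduce class labels and groups to must-link and cannot-link pairs. The minimal Python sketch below illustrates that reduction. The function names `pairs_from_labels` and `pairs_from_groups` are hypothetical, and the sketch assumes that grouping information means sets of samples known to share one class whose identity is unknown, so groups yield must-link pairs only.

```python
from itertools import combinations


def pairs_from_labels(labels):
    """Turn class labels into pairwise constraints (illustrative helper).

    labels: dict mapping sample index -> class label, labeled samples only.
    Returns (must_link, cannot_link) as sets of (i, j) tuples with i < j.
    """
    must_link, cannot_link = set(), set()
    # Sort by sample index so every emitted pair satisfies i < j.
    for (i, yi), (j, yj) in combinations(sorted(labels.items()), 2):
        if yi == yj:
            must_link.add((i, j))      # same class: should share a cluster
        else:
            cannot_link.add((i, j))    # different classes: should be split
    return must_link, cannot_link


def pairs_from_groups(groups):
    """Turn grouping information into must-link pairs (illustrative helper).

    groups: iterable of index collections; each collection is assumed to
    contain samples of one common, but unknown, class.
    """
    must_link = set()
    for group in groups:
        must_link.update(combinations(sorted(group), 2))
    return must_link


if __name__ == "__main__":
    ml, cl = pairs_from_labels({0: "a", 1: "a", 4: "b"})
    print(sorted(ml), sorted(cl))
    print(sorted(pairs_from_groups([[2, 3, 5]])))
```

The resulting pair sets could then feed whatever affinity adjustment or penalty term a constrained spectral clustering method defines; that step is method-specific and is not sketched here.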