Politecnico di Milano, Milano, Italy.
Tokyo Institute of Technology, Tokyo, Japan.
Neural Netw. 2014 Sep;57:103-11. doi: 10.1016/j.neunet.2014.05.016. Epub 2014 Jun 4.
Semi-supervised clustering aims to introduce prior knowledge into the decision process of a clustering algorithm. In this paper, we propose a novel semi-supervised clustering algorithm based on the information-maximization principle. The proposed method extends an existing unsupervised information-maximization clustering algorithm based on squared-loss mutual information so that it can effectively incorporate must-links and cannot-links, that is, pairwise constraints stating that two samples belong to the same cluster or to different clusters. The proposed method is computationally efficient because the clustering solution can be obtained analytically via eigendecomposition. Furthermore, the proposed method allows systematic optimization of tuning parameters such as the kernel width, given the degree of belief in the must-links and cannot-links. The usefulness of the proposed method is demonstrated through experiments.
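To make the idea of an analytic, eigendecomposition-based solution with pairwise constraints more concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a Gaussian kernel, assumes that constraints are injected by simply adding or subtracting a fixed amount from the corresponding kernel entries, and assumes a simple rule that assigns each sample to the leading eigenvector with the largest normalized positive entry. The names `gaussian_kernel`, `constrained_eigen_clustering`, `width` and `belief` are hypothetical, and both tuning parameters are fixed by hand here rather than optimized systematically as the abstract describes.

```python
import numpy as np


def gaussian_kernel(X, width):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * width ** 2))


def constrained_eigen_clustering(X, n_clusters, must_links=(), cannot_links=(),
                                 width=1.0, belief=1.0):
    """Illustrative sketch only: cluster via the leading eigenvectors of a
    kernel matrix whose entries are adjusted by pairwise constraints.

    ASSUMPTION: adding/subtracting `belief` on constrained pairs is a
    simplification chosen for this example, not the paper's formulation.
    """
    K = gaussian_kernel(X, width)

    # Raise the similarity of must-linked pairs, lower it for cannot-linked pairs.
    for i, j in must_links:
        K[i, j] += belief
        K[j, i] += belief
    for i, j in cannot_links:
        K[i, j] -= belief
        K[j, i] -= belief

    # K stays symmetric, so eigh applies; eigenvalues come back in ascending order.
    _, eigvecs = np.linalg.eigh(K)
    top = eigvecs[:, -n_clusters:]

    # Eigenvectors are defined up to sign: flip each so its entries sum to >= 0.
    signs = np.sign(top.sum(axis=0))
    signs[signs == 0] = 1.0
    top = top * signs

    # Keep the positive part, normalize each column, and take the arg max per sample.
    positive = np.maximum(top, 0.0)
    positive = positive / positive.sum(axis=0)
    return np.argmax(positive, axis=1)
```

Calling `constrained_eigen_clustering(X, 2, must_links=[(0, 1)], cannot_links=[(0, 30)])` on an array `X` of shape `(n_samples, n_features)` returns one integer label per sample. There is no iterative optimization: the only heavy step is a single symmetric eigendecomposition, which is the property the abstract highlights as the source of computational efficiency.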