Semi-supervised linear discriminant clustering.

Publication information

IEEE Trans Cybern. 2014 Jul;44(7):989-1000. doi: 10.1109/TCYB.2013.2278466. Epub 2013 Aug 27.

Abstract

This paper devises a semi-supervised learning method called semi-supervised linear discriminant clustering (Semi-LDC). The proposed algorithm considers clustering and dimensionality reduction simultaneously by connecting K-means and linear discriminant analysis (LDA). The goal is to find a feature space in which K-means performs well. To exploit the information carried by unlabeled examples, the paper proposes using soft labels to represent their labels. Semi-LDC estimates the soft labels of unlabeled examples with a proposed algorithm called constrained-PLSA. Soft LDA, applied to the hard labels of labeled examples and the soft labels of unlabeled examples, then yields a projection matrix, and clustering is performed in the new feature space. Experiments on three data sets indicate that the proposed method generally outperforms other semi-supervised methods. The influence of soft labels on classification performance is further analyzed through experiments with different percentages of labeled examples. The results show that soft labels improve performance particularly when the number of available labeled examples is insufficient to train a robust and accurate model. Additionally, the proposed method can be viewed as a framework, since different soft-label estimation methods can be plugged in according to application requirements.
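The pipeline described in the abstract (soft labels, then soft LDA, then K-means in the projected space) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes two classes and 2-D inputs so that the LDA direction has the closed form w = S_w^{-1}(m_1 - m_0), and the soft labels in `R` are hand-picked placeholders, since constrained-PLSA is not reproduced here. The function names `soft_lda_direction` and `kmeans_1d` and the toy data are hypothetical.

```python
# Sketch only: two classes, 2-D points, standard library only.
# The soft labels are hand-picked placeholders; the paper estimates them
# with constrained-PLSA, which is NOT implemented here.

def soft_lda_direction(X, R):
    """Return the two-class LDA direction w = Sw^{-1} (m1 - m0).

    X: list of (x, y) points.
    R: list of [r0, r1] class weights per point (one-hot for labeled
       points, fractional soft labels for unlabeled points).
    """
    # Soft class means: every point contributes in proportion to its label.
    means = []
    for k in range(2):
        weight = sum(r[k] for r in R)
        mx = sum(r[k] * p[0] for p, r in zip(X, R)) / weight
        my = sum(r[k] * p[1] for p, r in zip(X, R)) / weight
        means.append((mx, my))

    # Soft within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for p, r in zip(X, R):
        for k in range(2):
            dx, dy = p[0] - means[k][0], p[1] - means[k][1]
            sxx += r[k] * dx * dx
            sxy += r[k] * dx * dy
            syy += r[k] * dy * dy
    # Small ridge term (an assumption of this sketch) keeps Sw invertible.
    sxx += 1e-6
    syy += 1e-6

    # Closed-form 2x2 inverse applied to the difference of class means.
    det = sxx * syy - sxy * sxy
    bx = means[1][0] - means[0][0]
    by = means[1][1] - means[0][1]
    return ((syy * bx - sxy * by) / det, (-sxy * bx + sxx * by) / det)


def kmeans_1d(values, iters=20):
    """Two-cluster K-means on scalars, initialized at the min and the max."""
    centers = [min(values), max(values)]
    assign = [0] * len(values)
    for _ in range(iters):
        assign = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for k in range(2):
            members = [v for v, a in zip(values, assign) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assign


# Toy data: two groups separated along x, with large nuisance spread in y.
X = [(0.0, 0.0), (0.2, 4.0), (-0.1, -3.0), (0.1, 2.0),
     (2.0, 0.5), (2.2, -4.0), (1.9, 3.0), (2.1, -2.0)]
# One labeled point per class (hard labels); the rest carry soft labels.
R = [[1.0, 0.0], [0.8, 0.2], [0.8, 0.2], [0.8, 0.2],
     [0.0, 1.0], [0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]

w = soft_lda_direction(X, R)
projected = [w[0] * x + w[1] * y for x, y in X]
clusters = kmeans_1d(projected)
print(clusters)
```

In this toy setup the projection suppresses the high-variance y coordinate, so one-dimensional K-means should separate the two groups even though only two points carry hard labels. Swapping in a different soft-label estimator would only change how `R` is produced, which matches the abstract's view of the method as a framework.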

