Probabilistic Semi-Supervised Learning via Sparse Graph Structure Learning.

Author information

Wang Li, Chan Raymond, Zeng Tieyong

Publication information

IEEE Trans Neural Netw Learn Syst. 2021 Feb;32(2):853-867. doi: 10.1109/TNNLS.2020.2979607. Epub 2021 Feb 4.

Abstract

We present a probabilistic semi-supervised learning (SSL) framework based on sparse graph structure learning. Different from existing SSL methods with either a predefined weighted graph heuristically constructed from the input data or a learned graph based on the locally linear embedding assumption, the proposed SSL model is capable of learning a sparse weighted graph from the unlabeled high-dimensional data and a small amount of labeled data, as well as dealing with the noise of the input data. Our representation of the weighted graph is indirectly derived from a unified model of density estimation and pairwise distance preservation in terms of various distance measurements, where latent embeddings are assumed to be random variables following an unknown density function to be learned, and pairwise distances are then calculated as the expectations over the density for the model robustness to the data noise. Moreover, the labeled data based on the same distance representations are leveraged to guide the estimated density for better class separation and sparse graph structure learning. A simple inference approach for the embeddings of unlabeled data based on point estimation and kernel representation is presented. Extensive experiments on various data sets show promising results in the setting of SSL compared with many existing methods and significant improvements on small amounts of labeled data.
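To make the idea of "pairwise distances calculated as expectations over the density" concrete, here is a minimal sketch. It is not the authors' model: it assumes, purely for illustration, that each latent embedding is an independent Gaussian with diagonal covariance (the abstract says only that the density is unknown and learned). Under that assumption the expected squared Euclidean distance has a closed form, the squared distance between the means plus the total variance of both embeddings, so noisier points are treated as farther apart. The function names `expected_sq_distance` and `monte_carlo_sq_distance` are hypothetical.

```python
import numpy as np


def expected_sq_distance(mu_i, var_i, mu_j, var_j):
    """Closed-form E||z_i - z_j||^2 for independent Gaussians.

    Assumption (not from the paper): z_i ~ N(mu_i, diag(var_i)) and
    z_j ~ N(mu_j, diag(var_j)), independent of each other.
    """
    mu_i = np.asarray(mu_i, dtype=float)
    mu_j = np.asarray(mu_j, dtype=float)
    var_i = np.asarray(var_i, dtype=float)
    var_j = np.asarray(var_j, dtype=float)
    # ||mu_i - mu_j||^2 + tr(Sigma_i) + tr(Sigma_j)
    return float(np.sum((mu_i - mu_j) ** 2) + var_i.sum() + var_j.sum())


def monte_carlo_sq_distance(mu_i, var_i, mu_j, var_j, n_samples=200_000, seed=0):
    """Sampling-based estimate of the same expectation, used as a sanity check."""
    rng = np.random.default_rng(seed)
    mu_i = np.asarray(mu_i, dtype=float)
    mu_j = np.asarray(mu_j, dtype=float)
    # Draw samples of both embeddings; scale is the per-dimension standard deviation.
    z_i = rng.normal(mu_i, np.sqrt(var_i), size=(n_samples, mu_i.size))
    z_j = rng.normal(mu_j, np.sqrt(var_j), size=(n_samples, mu_j.size))
    return float(np.mean(np.sum((z_i - z_j) ** 2, axis=1)))


if __name__ == "__main__":
    mu_a, var_a = [0.0, 0.0], [1.0, 1.0]
    mu_b, var_b = [3.0, 4.0], [0.5, 0.5]
    print("closed form:", expected_sq_distance(mu_a, var_a, mu_b, var_b))
    print("monte carlo:", monte_carlo_sq_distance(mu_a, var_a, mu_b, var_b))
```

With zero variances the expression reduces to the ordinary squared distance between the two points, which is the deterministic case that a predefined-graph method would use.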

