基于核传播的半监督核矩阵学习

Semisupervised kernel matrix learning by kernel propagation.

作者信息

Hu Enliang, Chen Songcan, Zhang Daoqiang, Yin Xuesong

机构信息

Department of Mathematics, Yunnan Normal University, Kunming, China.

出版信息

IEEE Trans Neural Netw. 2010 Nov;21(11):1831-41. doi: 10.1109/TNN.2010.2076301. Epub 2010 Oct 4.

DOI:10.1109/TNN.2010.2076301

PMID:20923733

Abstract

The goal of semisupervised kernel matrix learning (SS-KML) is to learn a kernel matrix on all the given samples on which just a little supervised information, such as class label or pairwise constraint, is provided. Despite extensive research, the performance of SS-KML still leaves some space for improvement in terms of effectiveness and efficiency. For example, a recent pairwise constraints propagation (PCP) algorithm has formulated SS-KML into a semidefinite programming (SDP) problem, but its computation is very expensive, which undoubtedly restricts PCPs scalability in practice. In this paper, a novel algorithm, called kernel propagation (KP), is proposed to improve the comprehensive performance in SS-KML. The main idea of KP is first to learn a small-sized sub-kernel matrix (named seed-kernel matrix) and then propagate it into a larger-sized full-kernel matrix. Specifically, the implementation of KP consists of three stages: 1) separate the supervised sample (sub)set X(l) from the full sample set X; 2) learn a seed-kernel matrix on X(l) through solving a small-scale SDP problem; and 3) propagate the learnt seed-kernel matrix into a full-kernel matrix on X . Furthermore, following the idea in KP, we naturally develop two conveniently realizable out-of-sample extensions for KML: one is batch-style extension, and the other is online-style extension. The experiments demonstrate that KP is encouraging in both effectiveness and efficiency compared with three state-of-the-art algorithms and its related out-of-sample extensions are promising too.

摘要

半监督核矩阵学习（SS-KML）的目标是在所有给定样本上学习一个核矩阵，在这些样本上仅提供少量监督信息，例如类别标签或成对约束。尽管进行了广泛研究，但SS-KML的性能在有效性和效率方面仍有一定提升空间。例如，最近的成对约束传播（PCP）算法已将SS-KML表述为一个半定规划（SDP）问题，但其计算成本非常高，这无疑限制了PCP在实际中的可扩展性。本文提出了一种名为核传播（KP）的新算法，以提高SS-KML的综合性能。KP的主要思想是首先学习一个小尺寸的子核矩阵（称为种子核矩阵），然后将其传播为一个更大尺寸的全核矩阵。具体而言，KP的实现包括三个阶段：1）从全样本集X中分离出监督样本（子）集X(l)；2）通过求解一个小规模SDP问题在X(l)上学习一个种子核矩阵；3）将学习到的种子核矩阵传播为X上的全核矩阵。此外，遵循KP中的思想，我们自然地为KML开发了两种易于实现的样本外扩展：一种是批处理式扩展，另一种是在线式扩展。实验表明，与三种最先进的算法相比，KP在有效性和效率方面都令人鼓舞，并且其相关的样本外扩展也很有前景。