Institute of Automation, Chinese Academy of Sciences, Beijing, China.
IEEE Trans Pattern Anal Mach Intell. 2010 Nov;32(11):2039-53. doi: 10.1109/TPAMI.2010.35.
This paper presents local spline regression for semi-supervised classification. The core idea of our approach is to introduce splines developed in Sobolev space that map data points directly to class labels. The spline is composed of polynomials and Green's functions. It is smooth, nonlinear, and able to interpolate scattered data points with high accuracy. Specifically, in each neighborhood, an optimal spline is estimated via regularized least squares regression. With this spline, each of the neighboring data points is mapped to a class label. The regularized loss is then evaluated and expressed in terms of the class label vector. Finally, the losses evaluated in all local neighborhoods are accumulated to measure the global consistency on the labeled and unlabeled data. To achieve semi-supervised classification, an objective function is constructed by combining the global loss of the local spline regressions with the squared errors of the class labels of the labeled data. In this way, a transductive classification algorithm is developed in which a globally optimal classification can be obtained. In the semi-supervised learning setting, the proposed algorithm is analyzed and placed within the Laplacian regularization framework. Comparative classification experiments on many public data sets, and applications to interactive image segmentation and image matting, illustrate the validity of our method.
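The pipeline described in the abstract (local spline fit, loss written in terms of the label vector, accumulation over neighborhoods, then a global objective with a squared-error term on the labeled points) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The thin-plate kernel r^2 log r used as the Green's function, the linear polynomial part, the parameters `k`, `lam`, `gamma`, the small ridge term, and the function names `local_spline_matrix` and `lsr_classify` are all assumptions chosen for illustration, and the example is restricted to two classes coded as +1 and -1.

```python
import numpy as np

def local_spline_matrix(X_nbr, lam):
    """Return M_i such that the regularized local loss equals lam * f^T M_i f.

    Assumed spline: s(x) = sum_j alpha_j * phi(|x - x_j|) + beta_0 + beta^T x,
    with phi(r) = r^2 log r (an assumed Green's function) and P^T alpha = 0.
    """
    k, d = X_nbr.shape
    r = np.linalg.norm(X_nbr[:, None, :] - X_nbr[None, :, :], axis=2)
    K = r**2 * np.log(np.maximum(r, 1e-12))   # evaluates to 0 where r == 0
    P = np.hstack([np.ones((k, 1)), X_nbr])   # polynomial basis: 1, x_1, ..., x_d
    A = np.zeros((k + d + 1, k + d + 1))
    A[:k, :k] = K + lam * np.eye(k)
    A[:k, k:] = P
    A[k:, :k] = P.T
    # Solving A [alpha; beta] = [f; 0] gives alpha = M_i f, where M_i is the
    # upper-left k x k block of A^{-1}; the local loss is then lam * f^T M_i f.
    return np.linalg.inv(A)[:k, :k]

def lsr_classify(X, y, labeled, k=8, lam=0.1, gamma=100.0):
    """Transductive two-class sketch: minimize f^T M f + gamma * sum_labeled (f_i - y_i)^2."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    M = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(dist[i])[:k]         # x_i together with its k-1 nearest neighbors
        M[np.ix_(idx, idx)] += local_spline_matrix(X[idx], lam)
    D = np.diag(labeled.astype(float))        # selects the labeled points
    # The tiny ridge (an assumption) keeps the linear system nonsingular.
    return np.linalg.solve(M + gamma * D + 1e-8 * np.eye(n), gamma * D @ y)

# Toy data: two well-separated 2-D clusters with five labeled points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(4.0, 0.5, (40, 2))])
truth = np.r_[np.ones(40), -np.ones(40)]
labeled = np.zeros(80, dtype=bool)
labeled[[0, 1, 2, 3, 4, 40, 41, 42, 43, 44]] = True
y = np.where(labeled, truth, 0.0)             # unlabeled entries are ignored via D

f = lsr_classify(X, y, labeled)
accuracy = np.mean(np.sign(f) == truth)
```

Under these assumptions each local matrix maps constant and linear label vectors to zero, so the accumulated matrix `M` behaves like a graph Laplacian in the quadratic form, which is one way to read the abstract's remark about the Laplacian regularization framework.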