Ke Jingchen, Gong Chen, Liu Tongliang, Zhao Lin, Yang Jian, Tao Dacheng
IEEE Trans Cybern. 2022 Jan;52(1):164-177. doi: 10.1109/TCYB.2019.2953337. Epub 2022 Jan 11.
Semisupervised learning (SSL) is widely used in practical applications where labeled training examples are scarce while unlabeled examples are abundant. Because labeled examples are scarce, the performance of existing SSL methods is often degraded by outliers in the labeled data, which leads to an imperfectly trained classifier. To make SSL methods more robust to outliers, this article proposes a novel SSL algorithm called Laplacian Welsch regularization (LapWR). Specifically, in addition to the conventional Laplacian regularizer, we introduce a bounded, smooth, and nonconvex Welsch loss that suppresses the adverse effect of labeled outliers. To handle the nonconvexity caused by the Welsch loss, we adopt an iterative half-quadratic (HQ) optimization algorithm in which each subproblem has a closed-form solution. To handle large datasets, we further propose an accelerated model that uses the Nyström method to reduce the computational complexity of LapWR. Theoretically, we derive a generalization bound for LapWR by analyzing its Rademacher complexity, which suggests that the proposed algorithm is guaranteed to obtain satisfactory performance. Comparing LapWR with representative existing SSL algorithms on various benchmark and real-world datasets, we found experimentally that LapWR is robust to outliers and consistently achieves top-level results.
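To illustrate how a bounded Welsch loss combined with half-quadratic iterations can suppress labeled outliers, here is a minimal sketch. It is not the LapWR algorithm: it omits the Laplacian regularizer, the kernel model, and the Nyström acceleration, and applies the idea to a plain linear ridge regression. The parametrization (c^2/2)(1 - exp(-r^2/(2c^2))), the function names `welsch_loss` and `hq_welsch_ridge`, and the parameters `c`, `lam`, and `n_iter` are assumptions chosen for illustration, not taken from the article.

```python
import numpy as np


def welsch_loss(residual, c=1.0):
    """Welsch loss (c^2 / 2) * (1 - exp(-r^2 / (2 c^2))).

    Assumed parametrization; the loss is smooth and bounded above by c^2 / 2,
    so a single huge residual cannot dominate the objective.
    """
    r = np.asarray(residual, dtype=float)
    return 0.5 * c**2 * (1.0 - np.exp(-r**2 / (2.0 * c**2)))


def hq_welsch_ridge(X, y, c=1.0, lam=1e-3, n_iter=50):
    """Half-quadratic style iteration for linear ridge regression with Welsch loss.

    Each pass (1) fixes the model and computes auxiliary weights
    s_i = exp(-r_i^2 / (2 c^2)), then (2) fixes the weights and solves a
    weighted ridge problem in closed form.
    """
    n, d = X.shape
    reg = lam * np.eye(d)
    # Start from the ordinary (non-robust) ridge solution.
    w = np.linalg.solve(X.T @ X + reg, X.T @ y)
    for _ in range(n_iter):
        r = y - X @ w
        s = np.exp(-r**2 / (2.0 * c**2))  # near 0 for large residuals
        Xs = X * s[:, None]               # rows of X scaled by their weights
        # Closed-form solution of the weighted subproblem.
        w = np.linalg.solve(X.T @ Xs + reg, Xs.T @ y)
    return w


if __name__ == "__main__":
    # Toy data: y = 2x + 1 with five grossly corrupted labels.
    x = np.linspace(-1.0, 1.0, 50)
    X = np.column_stack([x, np.ones_like(x)])
    y = 2.0 * x + 1.0
    y[[0, 10, 20, 30, 40]] += 20.0
    print("robust fit (slope, intercept):", hq_welsch_ridge(X, y))
```

In this sketch the corrupted points receive weights close to zero after the first pass, so the later closed-form solves are driven almost entirely by the clean points. The choice of `c` controls how quickly a residual is treated as an outlier and would need tuning on real data.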