
Interpolation-Based Contrastive Learning for Few-Label Semi-Supervised Learning

Authors

Yang Xihong, Hu Xiaochang, Zhou Sihang, Liu Xinwang, Zhu En

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2054-2065. doi: 10.1109/TNNLS.2022.3186512. Epub 2024 Feb 5.

Abstract

Semi-supervised learning (SSL) has long been proved to be an effective technique for constructing powerful models with limited labels. In the existing literature, consistency regularization-based methods, which force perturbed samples to have predictions similar to those of the original ones, have attracted much attention for their promising accuracy. However, we observe that the performance of such methods decreases drastically when the labels become extremely limited, e.g., 2 or 3 labels for each category. Our empirical study finds that the main problem lies in the drift of semantic information during data augmentation. The problem can be alleviated when enough supervision is provided; however, when little guidance is available, the incorrect regularization misleads the network and undermines the performance of the algorithm. To tackle the problem, we: 1) propose an interpolation-based method to construct more reliable positive sample pairs and 2) design a novel contrastive loss that guides the embeddings of the learned network to change linearly between samples, improving the discriminative capability of the network by enlarging the margin of the decision boundaries. Since no destructive regularization is introduced, the performance of our proposed algorithm is largely improved. Specifically, the proposed algorithm outperforms the second-best algorithm (Comatch) by 5.3%, achieving 88.73% classification accuracy when only two labels are available for each class on the CIFAR-10 dataset. Moreover, we further prove the generality of the proposed method by considerably improving the performance of existing state-of-the-art algorithms with our proposed strategy. The corresponding code is available at https://github.com/xihongyang1999/ICL_SSL.
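The two components described in the abstract — interpolation-based construction of positive pairs and a loss that encourages embeddings to vary linearly between samples — can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the mixup-style convex combination, and the simple squared-error penalty are all assumptions made for clarity.

```python
import numpy as np

def interpolate(x1, x2, lam):
    """Mixup-style interpolation: a convex combination of two samples.

    The interpolated sample and its two sources form a natural
    positive pair/triplet without destructive augmentation.
    """
    return lam * x1 + (1.0 - lam) * x2

def linear_embedding_loss(z1, z2, z_mix, lam):
    """Penalize deviation of the interpolated sample's embedding
    from the line segment between the two source embeddings.

    z1, z2  : embeddings of the original samples
    z_mix   : embedding of interpolate(x1, x2, lam)
    Returns a scalar mean-squared-error penalty.
    """
    target = lam * z1 + (1.0 - lam) * z2
    return float(np.mean((z_mix - target) ** 2))

# Toy usage with an identity "encoder" (z == x), for which the
# linearity constraint is satisfied exactly and the loss is zero.
x1 = np.array([1.0, 0.0])
x2 = np.array([0.0, 1.0])
lam = 0.3
x_mix = interpolate(x1, x2, lam)
loss = linear_embedding_loss(x1, x2, x_mix, lam)
```

In a real training loop, `z1`, `z2`, and `z_mix` would come from the network's encoder, and this penalty would be combined with the standard supervised loss on the few labeled samples.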

