Jiang Runqing, Yan Yan, Xue Jing-Hao, Wang Biao, Wang Hanzi
IEEE Trans Neural Netw Learn Syst. 2024 Feb;35(2):2208-2222. doi: 10.1109/TNNLS.2022.3188799. Epub 2024 Feb 5.
Recent methods in network pruning have indicated that a dense neural network contains a sparse subnetwork (called a winning ticket), which can achieve test accuracy similar to that of its dense counterpart with far fewer network parameters. Generally, these methods search for winning tickets on well-labeled data. Unfortunately, in many real-world applications, the training data are unavoidably contaminated with noisy labels, leading to performance deterioration of these methods. To address this problem, we propose a novel two-stream sample selection network (TS3-Net), which consists of a sparse subnetwork and a dense subnetwork, to effectively identify the winning ticket under noisy labels. The training of TS3-Net is an iterative procedure that alternates between training both subnetworks and pruning the smallest-magnitude weights of the sparse subnetwork. In particular, we develop a multistage learning framework, comprising a warm-up stage, a semisupervised alternate learning stage, and a label refinement stage, to progressively train the two subnetworks. In this way, the classification capability of the sparse subnetwork can be gradually improved at a high sparsity level. Extensive experimental results on both synthetic and real-world noisy datasets (including MNIST, CIFAR-10, CIFAR-100, ANIMAL-10N, Clothing1M, and WebVision) demonstrate that our proposed method achieves state-of-the-art performance with very small memory consumption for label noise learning. Code is available at https://github.com/Runqing-forMost/TS3-Net/tree/master.
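The core train-then-prune alternation described above can be sketched in a few lines. The following is an illustrative NumPy sketch of iterative magnitude pruning with a binary mask, not the authors' implementation: the function name, the 20% per-round pruning rate, the three-round schedule, and the toy 8x8 weight matrix are all assumptions for demonstration.

```python
import numpy as np

def magnitude_prune(weights, mask, prune_frac):
    """Zero out the prune_frac smallest-magnitude weights among those
    still unpruned, returning an updated binary mask."""
    alive = weights[mask]                       # currently surviving weights
    k = int(prune_frac * alive.size)            # how many to remove this round
    if k == 0:
        return mask
    threshold = np.sort(np.abs(alive))[k - 1]   # magnitude of k-th smallest
    return mask & (np.abs(weights) > threshold) # keep strictly larger weights

# Hypothetical alternation between training and pruning
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))                     # toy dense layer
mask = np.ones_like(w, dtype=bool)              # everything unpruned at start
for _ in range(3):                              # three prune rounds (assumed)
    # ... train the dense and sparse subnetworks here (omitted) ...
    mask = magnitude_prune(w, mask, 0.2)        # prune 20% of survivors
    w = w * mask                                # sparse subnetwork keeps the rest
```

In the paper's setting, the training step between prune rounds is the multistage procedure (warm-up, semisupervised alternate learning, label refinement) that filters noisy labels, so the surviving weights are selected using progressively cleaner supervision.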