IEEE Trans Neural Netw Learn Syst. 2020 Oct;31(10):3962-3976. doi: 10.1109/TNNLS.2019.2947789. Epub 2019 Nov 13.
Sample balancing includes sample selection and sample reweighting. Sample selection aims to remove bad samples that may lead to poor local optima. Sample reweighting aims to assign optimal weights to samples to improve performance. In this article, we integrate a sample selection method based on self-paced learning into deep learning frameworks and study the influence of different sample selection strategies on training deep networks. In addition, most existing sample reweighting methods mainly take the per-class sample number as a metric, which does not fully consider sample quality. To improve performance, we propose a novel metric based on multiview semantic encoders to reweight the samples more appropriately. Then, we propose an optimization mechanism to embed sample weights into the loss functions of deep networks, which can be trained in an end-to-end manner. We conduct experiments on the CIFAR and ImageNet data sets. The experimental results demonstrate that our proposed sample balancing method can improve the performance of deep learning methods in several visual recognition tasks.
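The abstract does not include code, but the two mechanisms it describes (self-paced sample selection and embedding per-sample weights into the loss) can be illustrated with a minimal PyTorch-style sketch. The names self_paced_weights, weighted_loss, train_step, lam, and reweight_fn below are hypothetical; in particular, the authors' multiview-semantic-encoder metric is abstracted as a generic reweight_fn and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def self_paced_weights(per_sample_loss, lam):
    """Binary self-paced selection: keep 'easy' samples whose loss is below
    the age parameter lam (lam is typically increased over training)."""
    return (per_sample_loss < lam).float()

def weighted_loss(logits, targets, sample_weights):
    """Embed per-sample weights into the cross-entropy loss so the network
    can still be trained end-to-end."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    return (sample_weights * per_sample).mean()

def train_step(model, images, targets, optimizer, lam, reweight_fn=None):
    # Illustrative training step; model, optimizer, and reweight_fn are assumed
    # to be provided by the caller.
    logits = model(images)
    with torch.no_grad():
        per_sample = F.cross_entropy(logits, targets, reduction="none")
        w = self_paced_weights(per_sample, lam)      # sample selection
        if reweight_fn is not None:
            w = w * reweight_fn(images, targets)     # e.g., quality-based reweighting
    loss = weighted_loss(logits, targets, w)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the weights multiply the per-sample losses before reduction, gradients flow only through the weighted loss term, which is one simple way to realize the "embed sample weights into loss functions" mechanism the abstract mentions.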