Zhao Zhilin, Cao Longbing, Wang Chang-Dong
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1396-1409. doi: 10.1109/TNNLS.2023.3330475. Epub 2025 Jan 7.
The integrity of training data, even when annotated by experts, is far from guaranteed, especially for non-independent and identically distributed (non-IID) datasets comprising both in- and out-of-distribution samples. In an ideal scenario, the majority of samples would be in-distribution, while samples that deviate semantically would be identified as out-of-distribution and excluded during the annotation process. In practice, however, experts may erroneously classify these out-of-distribution samples as in-distribution, assigning them labels that are inherently unreliable. This mixture of unreliable labels and heterogeneous data makes learning robust neural networks notably challenging. We observe that both in- and out-of-distribution samples can almost invariably be ruled out from belonging to every class other than the one given by the (unreliable) ground-truth label. This opens the possibility of exploiting reliable complementary labels, which indicate the classes to which a sample does not belong. Guided by this insight, we introduce a novel approach, termed gray learning (GL), which leverages both ground-truth and complementary labels. Crucially, GL adaptively adjusts the loss weights for these two label types based on prediction confidence. By grounding our approach in statistical learning theory, we derive bounds for the generalization error, demonstrating that GL attains tight bounds even in non-IID settings. Extensive experimental evaluations reveal that our method significantly outperforms alternative approaches grounded in robust statistics.
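The abstract does not include an implementation, but the core idea, combining a ground-truth loss with a complementary-label loss under confidence-driven weights, can be sketched in code. The following is a minimal, hypothetical PyTorch sketch: the function name gray_learning_loss, the complementary term -log(1 - p_k), and the confidence-based weighting scheme are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gray_learning_loss(logits: torch.Tensor, targets: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Illustrative gray-learning-style loss (a sketch, not the paper's exact form).

    Combines a cross-entropy term on the (possibly unreliable) ground-truth label
    with a complementary term that pushes probability away from all other classes,
    weighted per sample by the model's own confidence in the given label.
    """
    probs = F.softmax(logits, dim=1)  # (B, C) class probabilities
    # Confidence in the given (possibly unreliable) label; detached so the
    # weights themselves do not receive gradients.
    conf = probs.gather(1, targets.unsqueeze(1)).squeeze(1).detach()

    # Ground-truth term: standard per-sample cross-entropy.
    ce = F.cross_entropy(logits, targets, reduction="none")

    # Complementary term: every class k != y serves as a complementary label
    # ("the sample is not k"), penalized via -log(1 - p_k) and averaged over
    # the C - 1 complementary classes.
    num_classes = logits.size(1)
    comp_mask = torch.ones_like(probs).scatter_(1, targets.unsqueeze(1), 0.0)
    comp = -(torch.log(1.0 - probs + eps) * comp_mask).sum(1) / (num_classes - 1)

    # Assumed adaptive weighting: trust the ground-truth label in proportion to
    # the model's confidence in it, and lean on complementary labels otherwise.
    return (conf * ce + (1.0 - conf) * comp).mean()
```

Under this assumed scheme, a sample the network confidently assigns to its annotated class is trained mostly with the ordinary cross-entropy term, while a low-confidence (potentially out-of-distribution or mislabeled) sample is trained mostly through its complementary labels, which remain reliable either way.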