Wang Jing, Geng Xin, Xue Hui
IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):5445-5459. doi: 10.1109/TPAMI.2021.3082623. Epub 2022 Aug 4.
Label ambiguity has attracted considerable attention in the machine learning community. The recently proposed Label Distribution Learning (LDL) paradigm can handle label ambiguity and has found wide application in real classification problems. In the training phase, an LDL model is learned first; in the test phase, the top label(s) in the label distribution predicted by the learned model are regarded as the predicted label(s). That is, LDL considers the whole label distribution during training but only the top label(s) during testing, which likely leads to objective inconsistency. To avoid such inconsistency, we propose a new LDL method, Re-Weighting Large-Margin Label Distribution Learning (RWLM-LDL). First, we prove that the expected L-norm loss of LDL bounds the classification error probability, and therefore adopt the L-norm loss as the learning metric. Second, re-weighting schemes are put forward to alleviate the inconsistency. Third, a large margin is introduced to further resolve the inconsistency. Theoretical results are presented to establish the generalization and discrimination properties of RWLM-LDL. Finally, experimental results show the statistically superior performance of RWLM-LDL over competing methods.
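The train/test mismatch the abstract describes can be illustrated with a minimal sketch: a linear-softmax model is fitted to whole label distributions with an L1 (sum of absolute differences) loss, but at test time only the top label of the predicted distribution is kept. The toy data, the choice of L1 as the "L-norm", the model, and all names here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Row-wise softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy data: 3 features, 3 labels; each target is a full label distribution.
X = rng.normal(size=(60, 3))
D = softmax(X @ rng.normal(size=(3, 3)) * 3.0)  # peaked ground-truth distributions

# Training phase: fit the whole distribution with an L1 loss
# via subgradient descent on a linear-softmax model.
W = np.zeros((3, 3))
for _ in range(500):
    P = softmax(X @ W)
    S = np.sign(P - D)  # subgradient of sum |P - D| w.r.t. P
    # Backprop through softmax: grad w.r.t. logits is P * (S - sum(S*P)).
    G = P * (S - (S * P).sum(axis=1, keepdims=True)) / len(X)
    W -= 0.5 * X.T @ G

# Test phase: only the top label of the predicted distribution is used,
# which is the objective inconsistency the paper targets.
pred_top = softmax(X @ W).argmax(axis=1)
true_top = D.argmax(axis=1)
acc = (pred_top == true_top).mean()
print(f"top-label accuracy: {acc:.2f}")
```

Because the loss is minimized over the whole distribution while evaluation only checks the argmax, a model with low training loss can still rank the top label incorrectly on some samples, which is the gap the re-weighting and large-margin components aim to close.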