Multiclass Learning With Partially Corrupted Labels.

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2568-2580. doi: 10.1109/TNNLS.2017.2699783. Epub 2017 May 16.

Abstract

Traditional classification systems rely heavily on sufficient training data with accurate labels. However, the quality of the collected data depends on the labelers, some of whom may be inexperienced and produce unexpected labels that degrade the performance of a learning system. In this paper, we investigate the multiclass classification problem where a certain number of training examples are randomly labeled. Specifically, we show that this issue can be formulated as a label noise problem. To perform multiclass classification, we employ the widely used importance reweighting strategy so that learning on noisy data more closely reflects the results on noise-free data. We illustrate the applicability of this strategy to any surrogate loss function and to different classification settings. The proportion of randomly labeled examples is proved to be upper bounded and can be estimated under a mild condition. The convergence analysis ensures the consistency of the learned classifier with the optimal classifier with respect to clean data. Two instantiations of the proposed strategy are also introduced. Experiments on synthetic and real data verify that our approach yields improvements over the traditional classifiers as well as the robust classifiers. Moreover, we empirically demonstrate that the proposed strategy is effective even on asymmetrically noisy data.
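The importance reweighting idea mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not state: a fraction rho (below 1) of examples receives a label drawn uniformly from C classes, so the noisy posterior is P_rho(y|x) = (1 - rho) P(y|x) + rho / C, and an estimate of that noisy posterior is already available from some probabilistic classifier. Under those assumptions each example gets the weight P(y|x) / P_rho(y|x), and because P(y|x) cannot be negative, rho is at most C times the smallest noisy posterior value. Treating that bound as an estimate of rho is only reasonable if some example has (near) zero clean probability for some class, which is a guess at what the "mild condition" might look like. All function names are hypothetical.

```python
def importance_weights(noisy_posteriors, labels, rho):
    """Weight beta_i = P(y_i|x_i) / P_rho(y_i|x_i) for each example.

    Assumes uniform random relabeling with proportion rho < 1, so that
    P(y|x) = (P_rho(y|x) - rho / C) / (1 - rho).
    """
    num_classes = len(noisy_posteriors[0])
    weights = []
    for probs, y in zip(noisy_posteriors, labels):
        p_noisy = probs[y]
        # Invert the assumed noise model; clip at 0 because estimated
        # posteriors can fall slightly below rho / C.
        p_clean = max((p_noisy - rho / num_classes) / (1.0 - rho), 0.0)
        weights.append(p_clean / max(p_noisy, 1e-12))
    return weights


def rho_upper_bound(noisy_posteriors):
    """Upper bound on rho: C times the smallest noisy posterior value."""
    num_classes = len(noisy_posteriors[0])
    smallest = min(min(probs) for probs in noisy_posteriors)
    return min(1.0, num_classes * smallest)


def reweighted_risk(losses, weights):
    """Empirical risk on noisy data with each loss scaled by its weight."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)
```

The per-example losses passed to `reweighted_risk` can come from any surrogate loss (for example cross-entropy or hinge), which matches the abstract's claim that the strategy is not tied to one loss function.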
