The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka, Japan.
Neural Netw. 2013 Oct;46:133-43. doi: 10.1016/j.neunet.2013.05.007. Epub 2013 May 15.
The accuracy of active learning is critically affected by noisy labels supplied by a noisy oracle. In this paper, we propose a novel pool-based active learning framework built on robust measures derived from density power divergence. By minimizing a density power divergence, such as the β-divergence or the γ-divergence, one can estimate the model accurately even when the data contain noisy labels. Accordingly, we develop query selection measures for pool-based active learning using these divergences. In addition, we propose an evaluation scheme for these measures based on asymptotic statistical analysis, which enables us to perform active learning by directly evaluating the estimation error. Experiments on benchmark datasets and real-world image datasets show that our active learning scheme outperforms several baseline methods.
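The robustness claim can be illustrated with a toy sketch. This is not the paper's query selection method: it assumes a one-dimensional Gaussian model with known standard deviation and contaminated observations in place of noisy labels. Under that assumption, minimizing the density power divergence (β-divergence) with respect to the mean reduces to a weighted-mean fixed point in which points far from the current estimate receive exponentially small weight. The function name `dpd_location` and all parameter values are hypothetical, chosen only for illustration.

```python
import math
import random


def dpd_location(xs, beta=0.5, sigma=1.0, iters=100):
    """Estimate a Gaussian mean by minimizing the density power divergence.

    Assumption: sigma is known and fixed, so the integral term of the
    divergence does not depend on mu and the minimizer satisfies
    mu = sum(w_i * x_i) / sum(w_i),
    with w_i = exp(-beta * (x_i - mu)^2 / (2 * sigma^2)).
    """
    mu = sorted(xs)[len(xs) // 2]  # start from the median, a robust initial guess
    for _ in range(iters):
        # Points far from the current estimate get exponentially small weight.
        ws = [math.exp(-beta * (x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]
        mu = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
    return mu


# Illustrative data: 90 clean points around 0 and 10 contaminated points around 10.
rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(90)]
noisy = [rng.gauss(10.0, 1.0) for _ in range(10)]
data = clean + noisy

mean_estimate = sum(data) / len(data)            # maximum likelihood estimate
robust_estimate = dpd_location(data, beta=0.5)   # density power divergence estimate
```

In this setup the sample mean is pulled toward the contaminated points, while the divergence-based estimate stays close to the center of the clean data. As β approaches 0 the weights become uniform and the estimate approaches the sample mean, so β trades statistical efficiency against robustness.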