IEEE Trans Image Process. 2014 Mar;23(3):1430-41. doi: 10.1109/TIP.2014.2302675.
Supervised machine learning techniques have been applied to multilabel image classification problems with tremendous success. Despite their disparate learning mechanisms, their performance relies heavily on the quality of the training images. However, acquiring training images requires significant effort from human annotators, which hinders the application of supervised learning techniques to large-scale problems. In this paper, we propose a high-order label correlation driven active learning (HoAL) approach that allows the iterative learning algorithm itself to select the informative example-label pairs from which it learns, so that an accurate classifier can be learned with less annotation effort. The proposed HoAL considers four crucial issues: 1) unlike the binary case, the selection granularity for multilabel active learning needs to be refined from the example to the example-label pair; 2) different labels are seldom independent, and label correlations provide critical information for efficient learning; 3) in addition to pairwise label correlations, high-order label correlations are also informative for multilabel active learning; and 4) since the number of label combinations increases exponentially with the number of labels, an efficient mining method is required to discover informative label correlations. The proposed approach is tested on public data sets, and the empirical results demonstrate its effectiveness.
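The abstract does not give HoAL's selection criterion or its mining procedure, so the following is only a minimal sketch of the two ideas it names, under stated assumptions. The first function picks an example-label pair (rather than a whole example) by plain uncertainty sampling, a stand-in and not the paper's criterion. The second finds co-occurring label sets of order higher than two with a generic level-wise (Apriori-style) search, which avoids enumerating all exponentially many label combinations. All function names, parameters, and the toy data are hypothetical.

```python
from itertools import combinations


def most_uncertain_pair(probs, queried):
    """Return the (example, label) index pair whose predicted probability is
    closest to 0.5, skipping pairs that were already queried.
    Assumption: plain uncertainty sampling, not the HoAL criterion."""
    best, best_margin = None, float("inf")
    for i, row in enumerate(probs):
        for j, p in enumerate(row):
            if (i, j) in queried:
                continue
            margin = abs(p - 0.5)
            if margin < best_margin:
                best, best_margin = (i, j), margin
    return best


def frequent_label_sets(label_rows, min_support, max_order):
    """Level-wise search for label sets that co-occur in at least
    `min_support` examples, with at most `max_order` labels per set.
    Assumption: a generic Apriori-style miner, not the paper's method."""
    rows = [frozenset(r) for r in label_rows]
    labels = sorted({label for r in rows for label in r})
    candidates = [frozenset([label]) for label in labels]
    result = {}
    for order in range(1, max_order + 1):
        # Count how many examples contain each candidate label set.
        counts = {s: sum(1 for r in rows if s <= r) for s in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        if not frequent:
            break
        result.update(frequent)
        # Build next-order candidates only from sets that were frequent.
        keys = list(frequent)
        candidates = list(
            {a | b for a, b in combinations(keys, 2) if len(a | b) == order + 1}
        )
    return result


# Toy data: predicted probabilities for 2 images x 2 labels.
probs = [[0.9, 0.45], [0.52, 0.1]]
pair = most_uncertain_pair(probs, queried=set())

# Toy annotations: the label set of each already-labeled image.
label_rows = [
    {"sky", "sea", "beach"},
    {"sky", "sea"},
    {"sky", "sea", "beach"},
    {"tree"},
]
correlated = frequent_label_sets(label_rows, min_support=2, max_order=3)
```

Selecting at the pair level means the annotator answers a single yes/no question per query instead of labeling a whole image. Growing candidates only from label sets that already met the support threshold is what keeps the search from visiting every possible combination.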