Multi-Modal Curriculum Learning for Semi-Supervised Image Classification.

Publication information

IEEE Trans Image Process. 2016 Jul;25(7):3249-3260. doi: 10.1109/TIP.2016.2563981. Epub 2016 May 5.

Abstract

Semi-supervised image classification aims to classify a large quantity of unlabeled images by typically harnessing scarce labeled images. Existing semi-supervised methods often suffer from inadequate classification accuracy when encountering difficult yet critical images, such as outliers, because they treat all unlabeled images equally and conduct classifications in an imperfectly ordered sequence. In this paper, we employ the curriculum learning methodology by investigating the difficulty of classifying every unlabeled image. The reliability and the discriminability of these unlabeled images are particularly investigated for evaluating their difficulty. As a result, an optimized image sequence is generated during the iterative propagations, and the unlabeled images are logically classified from simple to difficult. Furthermore, since images are usually characterized by multiple visual feature descriptors, we associate each kind of features with a teacher, and design a multi-modal curriculum learning (MMCL) strategy to integrate the information from different feature modalities. In each propagation, each teacher analyzes the difficulties of the currently unlabeled images from its own modality viewpoint. A consensus is subsequently reached among all the teachers, determining the currently simplest images (i.e., a curriculum), which are to be reliably classified by the multi-modal learner. This well-organized propagation process leveraging multiple teachers and one learner enables our MMCL to outperform five state-of-the-art methods on eight popular image data sets.
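The propagation loop described in the abstract (each teacher scores difficulty in its own modality, the teachers reach a consensus on the simplest images, and the learner classifies only those) can be sketched as a toy example. This is a minimal illustration under stated assumptions, not the paper's formulation: the abstract does not give the reliability or discriminability measures, so the sketch substitutes distance to the nearest labeled image as a hypothetical difficulty score, average rank as a hypothetical consensus rule, and a nearest-labeled-neighbor vote as a hypothetical learner. All function names are invented for illustration.

```python
import numpy as np


def teacher_difficulty(features, labeled_idx, unlabeled_idx):
    """Hypothetical per-modality difficulty: distance from each unlabeled
    image to its nearest labeled image (smaller means simpler)."""
    diff = features[unlabeled_idx][:, None, :] - features[labeled_idx][None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)


def consensus_curriculum(scores_per_teacher, k):
    """Hypothetical consensus: average each image's rank across teachers and
    return the positions of the k images with the lowest mean rank."""
    ranks = [
        np.argsort(np.argsort(s, kind="stable"), kind="stable")
        for s in scores_per_teacher
    ]
    mean_rank = np.mean(ranks, axis=0)
    return np.argsort(mean_rank, kind="stable")[:k]


def classify_simple_to_difficult(modalities, labels, k=1):
    """Label the unlabeled images (marked -1) from simple to difficult.

    modalities: list of (n_images, n_features) arrays, one per feature type.
    labels: integer array with -1 for unlabeled images; needs >= 1 labeled.
    """
    labels = labels.copy()
    while (labels == -1).any():
        labeled = np.flatnonzero(labels != -1)
        unlabeled = np.flatnonzero(labels == -1)
        # Each teacher judges difficulty from its own modality.
        scores = [teacher_difficulty(f, labeled, unlabeled) for f in modalities]
        # The teachers agree on the current curriculum (the k simplest images).
        chosen = unlabeled[consensus_curriculum(scores, k)]
        # Assumed learner: copy the label of the nearest labeled image,
        # using distances summed over all modalities.
        total = sum(
            np.linalg.norm(f[chosen][:, None, :] - f[labeled][None, :, :], axis=2)
            for f in modalities
        )
        labels[chosen] = labels[labeled[total.argmin(axis=1)]]
    return labels


if __name__ == "__main__":
    modality_a = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
    modality_b = 2.0 * modality_a  # a second, rescaled "descriptor"
    seed_labels = np.array([0, -1, -1, -1, -1, 1])
    print(classify_simple_to_difficult([modality_a, modality_b], seed_labels))
```

Because newly classified images join the labeled set before the next round, images far from any seed label are deferred until closer, easier ones have been resolved, which is the simple-to-difficult ordering the abstract describes.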
