Wan Fang, Wei Pengxu, Han Zhenjun, Jiao Jianbin, Ye Qixiang
IEEE Trans Pattern Anal Mach Intell. 2019 Oct;41(10):2395-2409. doi: 10.1109/TPAMI.2019.2898858. Epub 2019 Feb 12.
Weakly supervised object detection is challenging because only image category supervision is provided, yet object locations and object detectors must be learned at the same time. The inconsistency between the weak supervision and the learning objectives introduces significant randomness to object locations and ambiguity to detectors. In this paper, a min-entropy latent model (MELM) is proposed for weakly supervised object detection. Min-entropy serves both as a model to learn object locations and as a metric to measure the randomness of object localization during learning. It aims to reduce, in a principled way, the variance of learned instances and to alleviate the ambiguity of detectors. MELM is decomposed into three components: proposal clique partition, object clique discovery, and object localization. MELM is optimized with a recurrent learning algorithm, which leverages continuation optimization to solve the challenging non-convexity problem. Experiments demonstrate that MELM significantly improves the performance of weakly supervised object detection, weakly supervised object localization, and image classification compared with state-of-the-art approaches.
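The idea of entropy as a measure of localization randomness can be illustrated with a minimal sketch. This is not the MELM objective itself, which the abstract describes as operating on proposal cliques; it only assumes that each object proposal in an image carries a non-negative score, and it computes the Shannon entropy of the normalized scores. The function name `localization_entropy` and the example scores are hypothetical.

```python
import math


def localization_entropy(scores):
    """Shannon entropy (in nats) of proposal scores normalized to sum to 1.

    Illustrative assumption: `scores` holds one non-negative score per
    object proposal. Low entropy means the score mass is concentrated on
    a few proposals; high entropy means localization is close to random.
    """
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    probs = [s / total for s in scores]
    # Terms with p == 0 contribute nothing (limit of p * log p is 0).
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical scores for four proposals in one image.
uniform_scores = [0.25, 0.25, 0.25, 0.25]  # no proposal preferred
peaked_scores = [0.97, 0.01, 0.01, 0.01]   # one proposal dominates

print(localization_entropy(uniform_scores))
print(localization_entropy(peaked_scores))
```

Under this toy measure, uniform scores reach the maximum entropy of log(4) for four proposals, while scores concentrated on one proposal give a value near zero. Minimizing such a quantity during training would push the model toward consistent, low-variance localization.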