Zhang Dingwen, Han Junwei, Zhao Long, Zhao Tao
IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5549-5560. doi: 10.1109/TNNLS.2020.2969483. Epub 2020 Nov 30.
Weakly supervised object detection (WSOD) is an interesting yet challenging task in the computer vision community. The core problem is to discover image regions that contain complete object instances when only image-level supervision is available. Existing works usually address this problem with a proposal selection strategy, which selects the most discriminative box regions from the weakly labeled training images. However, these regions usually contain only the discriminative object parts rather than the complete object instances. To address this problem, this article proposes to learn a searching-agent that gradually mines desirable object regions under a region searching paradigm: we formulate the searching process as a Markov decision process and learn the searching-agent within a deep reinforcement learning framework. To learn such a searching-agent under weak supervision, we extract pseudo-complete object regions and the corresponding local discriminative object parts, and introduce the resulting pseudo-target-part training pairs into the reinforcement learning process of the searching-agent. This learning strategy has two advantages: 1) it can mimic the searching process that reveals the complete object region starting from a discriminative part of the object under weak supervision, and 2) it does not suffer from the learning difficulty that arises from the long action sequences needed when searching from the entire image range. Comprehensive experiments on benchmark data sets demonstrate that integrating the learned searching-agent with an existing WSOD method achieves better performance than other state-of-the-art and baseline methods.
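The region-searching formulation described above can be illustrated with a small sketch of one search episode on a single pseudo-target-part pair. Everything specific here is an assumption made for illustration, since the abstract does not give these details: the action set (`ACTIONS`), the relative step size, the reward values, the stopping threshold, and the helper names `iou`, `apply_action`, `reward`, and `greedy_rollout` are all hypothetical. The greedy rollout is only a stand-in for the learned searching-agent; it is not the deep reinforcement learning procedure used in the article.

```python
# Minimal sketch of a box-search MDP that starts from a discriminative part.
# All names, constants, and the reward design are illustrative assumptions.

ACTIONS = ("expand_left", "expand_right", "expand_up", "expand_down", "stop")


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def apply_action(box, action, img_w, img_h, step=0.2):
    """State transition: grow one side of the box by `step` of its size."""
    x1, y1, x2, y2 = box
    dw, dh = step * (x2 - x1), step * (y2 - y1)
    if action == "expand_left":
        x1 = max(0.0, x1 - dw)
    elif action == "expand_right":
        x2 = min(float(img_w), x2 + dw)
    elif action == "expand_up":
        y1 = max(0.0, y1 - dh)
    elif action == "expand_down":
        y2 = min(float(img_h), y2 + dh)
    return (x1, y1, x2, y2)  # "stop" leaves the box unchanged


def reward(prev_box, new_box, target, action, stop_iou=0.6):
    """Assumed reward: sign of the IoU change, with a larger terminal reward."""
    if action == "stop":
        return 3.0 if iou(new_box, target) >= stop_iou else -3.0
    return 1.0 if iou(new_box, target) > iou(prev_box, target) else -1.0


def greedy_rollout(part, target, img_w, img_h, max_steps=30):
    """Oracle episode: always take the move that raises IoU the most.

    A trained agent would choose actions from image features without seeing
    the pseudo-target; the oracle only shows what a good episode looks like.
    """
    box, trajectory = part, []
    moves = [a for a in ACTIONS if a != "stop"]
    for _ in range(max_steps):
        best = max(
            moves,
            key=lambda a: iou(apply_action(box, a, img_w, img_h), target),
        )
        nxt = apply_action(box, best, img_w, img_h)
        if iou(nxt, target) <= iou(box, target):
            # No move helps any more, so the episode ends with "stop".
            trajectory.append((box, "stop", reward(box, box, target, "stop")))
            break
        trajectory.append((box, best, reward(box, nxt, target, best)))
        box = nxt
    return box, trajectory


if __name__ == "__main__":
    part = (40.0, 40.0, 60.0, 60.0)     # pseudo discriminative part
    target = (20.0, 20.0, 80.0, 80.0)   # pseudo-complete object region
    final_box, steps = greedy_rollout(part, target, 100, 100)
    print(len(steps), round(iou(final_box, target), 3))
```

Starting each episode from the part box rather than from the whole image keeps the action sequence short, which is the second advantage the abstract names.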