STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2017 Nov;39(11):2314-2320. doi: 10.1109/TPAMI.2016.2636150. Epub 2016 Dec 6.

Abstract

Recently, significant progress has been made on semantic object segmentation thanks to the development of deep convolutional neural networks (DCNNs). Training such a DCNN usually relies on a large number of images with pixel-level segmentation masks, and annotating these images is very costly in terms of both money and human effort. In this paper, we propose a simple-to-complex (STC) framework in which only image-level annotations are used to learn DCNNs for semantic segmentation. Specifically, we first train an initial segmentation network, called Initial-DCNN, with the saliency maps of simple images (i.e., those with a single category of major object(s) and a clean background). These saliency maps can be obtained automatically by existing bottom-up salient object detection techniques, which require no supervision. Then, a better network, called Enhanced-DCNN, is learned under supervision from the segmentation masks of simple images predicted by the Initial-DCNN, together with the image-level annotations. Finally, pixel-level segmentation masks of complex images (two or more categories of objects with cluttered backgrounds), inferred using the Enhanced-DCNN and image-level annotations, serve as the supervision for learning the Powerful-DCNN for semantic segmentation. Our method uses 40K simple images from Flickr.com and 10K complex images from PASCAL VOC for stepwise boosting of the segmentation network. Extensive experiments on the PASCAL VOC 2012 segmentation benchmark demonstrate the superiority of the proposed STC framework over other state-of-the-art methods.
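The abstract describes a three-stage bootstrapping pipeline. The sketch below restates that flow in Python purely as an illustration; it is not the authors' implementation. All names (mask_from_saliency, restrict_to_labels, train_segmentation_net) and the saliency-thresholding heuristic are assumptions, and each train_segmentation_net call stands in for ordinary supervised training of a segmentation DCNN with per-pixel labels.

```python
# Hypothetical sketch of the three-stage STC training flow described in the
# abstract. Images are assumed to carry only image-level annotations
# (.label for single-category simple images, .labels for complex images)
# plus, for simple images, an unsupervised bottom-up saliency map.

def mask_from_saliency(saliency_map, image_label, threshold=0.5):
    """Turn an unsupervised saliency map into a pseudo mask: salient
    pixels take the image-level class id, the rest become background (0).
    The fixed threshold is a simplifying assumption."""
    return [[image_label if s > threshold else 0 for s in row]
            for row in saliency_map]

def restrict_to_labels(predicted_mask, image_labels):
    """Suppress predicted classes that contradict the image-level tags,
    so pseudo masks contain only categories known to be present."""
    allowed = set(image_labels) | {0}  # 0 = background
    return [[c if c in allowed else 0 for c in row]
            for row in predicted_mask]

def train_stc(simple_images, complex_images, train_segmentation_net):
    # Stage 1: Initial-DCNN, trained on saliency-derived masks of simple
    # images (single major object category, clean background).
    stage1_masks = [mask_from_saliency(img.saliency, img.label)
                    for img in simple_images]
    initial_dcnn = train_segmentation_net(simple_images, stage1_masks)

    # Stage 2: Enhanced-DCNN, supervised by Initial-DCNN predictions on
    # the same simple images, filtered by their image-level annotations.
    stage2_masks = [restrict_to_labels(initial_dcnn(img), [img.label])
                    for img in simple_images]
    enhanced_dcnn = train_segmentation_net(simple_images, stage2_masks)

    # Stage 3: Powerful-DCNN, trained on Enhanced-DCNN masks for complex
    # images (multiple categories, cluttered background), again
    # constrained by the image-level tags.
    stage3_masks = [restrict_to_labels(enhanced_dcnn(img), img.labels)
                    for img in complex_images]
    return train_segmentation_net(complex_images, stage3_masks)
```

The point the abstract emphasizes is that supervision is bootstrapped: each stage trains on pseudo masks produced by the previous, weaker network, moving from simple to complex imagery without ever using pixel-level ground truth.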
