Beyond Object Proposals: Random Crop Pooling for Multi-Label Image Recognition.

Publication information

IEEE Trans Image Process. 2016 Dec;25(12):5678-5688. doi: 10.1109/TIP.2016.2612829. Epub 2016 Sep 22.

DOI:10.1109/TIP.2016.2612829

Abstract

Learning high-level image representations using object proposals has achieved remarkable success in multi-label image recognition. However, most object proposals provide merely coarse information about the objects, and only carefully selected proposals can be helpful for boosting the performance of multi-label image recognition. In this paper, we propose an object-proposal-free framework for multi-label image recognition: random crop pooling (RCP). Basically, RCP performs stochastic scaling and cropping over images before feeding them to a standard convolutional neural network, which works quite well with a max-pooling operation for recognizing the complex contents of multi-label images. To better fit the multi-label image recognition task, we further develop a new loss function, the dynamic weighted Euclidean loss, for the training of the deep network. Our RCP approach is amazingly simple yet effective. It can achieve significantly better image recognition performance than the approaches using object proposals. Moreover, our adapted network can be easily trained in an end-to-end manner. Extensive experiments are conducted on two representative multi-label image recognition data sets (i.e., PASCAL VOC 2007 and PASCAL VOC 2012), and the results clearly demonstrate the superiority of our approach.
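
The crop-then-pool idea in the abstract can be illustrated with a minimal sketch, assuming NumPy. It is not the authors' implementation: the function names (`random_crops`, `crop_pooled_scores`, `toy_score_fn`) are hypothetical, the toy scorer merely stands in for the convolutional network, and the nearest-neighbour resize, scale range, crop size, and crop count are illustrative choices that do not come from the paper. The dynamic weighted Euclidean loss is not sketched here because the abstract does not define it.

```python
import numpy as np


def random_crops(image, num_crops, crop_size, scale_range, rng):
    """Return `num_crops` square crops taken from randomly rescaled copies of `image`.

    `image` has shape (H, W, C). Rescaling uses nearest-neighbour sampling
    (an illustrative choice, not taken from the paper).
    """
    h, w = image.shape[:2]
    crops = []
    for _ in range(num_crops):
        scale = rng.uniform(*scale_range)
        # Never shrink below the crop size, so a full crop always fits.
        new_h = max(crop_size, int(round(h * scale)))
        new_w = max(crop_size, int(round(w * scale)))
        rows = np.arange(new_h) * h // new_h
        cols = np.arange(new_w) * w // new_w
        scaled = image[rows][:, cols]
        top = rng.integers(0, new_h - crop_size + 1)
        left = rng.integers(0, new_w - crop_size + 1)
        crops.append(scaled[top:top + crop_size, left:left + crop_size])
    return np.stack(crops)


def toy_score_fn(crop):
    """Hypothetical stand-in for a CNN: one score per 'label' (here, per channel)."""
    return crop.mean(axis=(0, 1))


def crop_pooled_scores(image, score_fn, num_crops=8, crop_size=64,
                       scale_range=(0.5, 1.5), seed=0):
    """Score every random crop, then max-pool the scores across crops per label."""
    rng = np.random.default_rng(seed)
    crops = random_crops(image, num_crops, crop_size, scale_range, rng)
    per_crop = np.stack([score_fn(c) for c in crops])  # (num_crops, num_labels)
    return per_crop.max(axis=0)                        # (num_labels,)


# Usage: a random array plays the role of an input image.
image = np.random.default_rng(1).random((100, 120, 3))
scores = crop_pooled_scores(image, toy_score_fn)
```

Taking the maximum across crops means a label only needs to score highly in one crop, which is one plausible reading of why max-pooling suits images that contain several objects at different positions and scales.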

Similar articles

HCP: A Flexible CNN Framework for Multi-label Image Classification.

IEEE Trans Pattern Anal Mach Intell. 2016 Sep 1;38(9):1901-1907. doi: 10.1109/TPAMI.2015.2491929. Epub 2015 Oct 26.

Coarse-to-Fine Semantic Segmentation From Image-Level Labels.

IEEE Trans Image Process. 2020;29:225-236. doi: 10.1109/TIP.2019.2926748. Epub 2019 Jul 12.

Progressive Representation Adaptation for Weakly Supervised Object Localization.

IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1424-1438. doi: 10.1109/TPAMI.2019.2899839. Epub 2019 Feb 15.

Object-Location-Aware Hashing for Multi-Label Image Retrieval via Automatic Mask Learning.

IEEE Trans Image Process. 2018 Sep;27(9):4490-4502. doi: 10.1109/TIP.2018.2839522.

Learning to Discover Multi-Class Attentional Regions for Multi-Label Image Recognition.

IEEE Trans Image Process. 2021;30:5920-5932. doi: 10.1109/TIP.2021.3088605. Epub 2021 Jun 29.

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection.

IEEE Trans Pattern Anal Mach Intell. 2020 Jan;42(1):176-191. doi: 10.1109/TPAMI.2018.2876304. Epub 2018 Oct 16.

Weakly Supervised Object Detection via Object-Specific Pixel Gradient.

IEEE Trans Neural Netw Learn Syst. 2018 Dec;29(12):5960-5970. doi: 10.1109/TNNLS.2018.2816021. Epub 2018 Apr 9.

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.

IEEE Trans Pattern Anal Mach Intell. 2015 Sep;37(9):1904-16. doi: 10.1109/TPAMI.2015.2389824.

Simultaneously Discovering and Localizing Common Objects in Wild Images.

IEEE Trans Image Process. 2018 Sep;27(9):4503-4515. doi: 10.1109/TIP.2018.2839901.
