
HCP: A Flexible CNN Framework for Multi-label Image Classification.

Authors

Wei Yunchao, Xia Wei, Lin Min, Huang Junshi, Ni Bingbing, Dong Jian, Zhao Yao, Yan Shuicheng

Publication information

IEEE Trans Pattern Anal Mach Intell. 2016 Sep 1;38(9):1901-1907. doi: 10.1109/TPAMI.2015.2491929. Epub 2015 Oct 26.

DOI: 10.1109/TPAMI.2015.2491929
PMID: 26513778

Abstract

Convolutional Neural Networks (CNNs) have demonstrated promising performance in single-label image classification tasks. However, how CNNs best cope with multi-label images remains an open problem, mainly due to the complex underlying object layouts and insufficient multi-label training images. In this work, we propose a flexible deep CNN infrastructure, called Hypotheses-CNN-Pooling (HCP), in which an arbitrary number of object segment hypotheses are taken as inputs, a shared CNN is connected with each hypothesis, and the CNN outputs from the different hypotheses are aggregated with max pooling to produce the final multi-label predictions. Unique characteristics of this flexible deep CNN infrastructure include: 1) no ground-truth bounding box information is required for training; 2) the whole HCP infrastructure is robust to possibly noisy and/or redundant hypotheses; 3) the shared CNN is flexible and can be well pre-trained with a large-scale single-label image dataset, e.g., ImageNet; and 4) it may naturally output multi-label prediction results. Experimental results on the Pascal VOC 2007 and VOC 2012 multi-label image datasets demonstrate the superiority of the proposed HCP infrastructure over other state-of-the-art methods. In particular, on the VOC 2012 dataset the mAP reaches 90.5% with HCP alone and 93.2% after fusion with our complementary result in [44], which is based on hand-crafted features.
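The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `shared_cnn` is a hypothetical stand-in (a single linear layer over precomputed feature vectors) for the real shared CNN, and the feature dimension, class count, and random inputs are assumptions chosen only for illustration. What it shows is the cross-hypothesis max pooling that turns per-hypothesis scores into one image-level score vector.

```python
import numpy as np


def shared_cnn(hypotheses, weights):
    """Hypothetical stand-in for the shared CNN.

    Maps each hypothesis feature vector to raw per-class scores with one
    linear layer. The real model is a deep CNN applied to image regions.
    """
    return hypotheses @ weights  # shape: (num_hypotheses, num_classes)


def hcp_predict(hypotheses, weights):
    """Score every hypothesis with the same network, then max-pool per class."""
    scores = shared_cnn(hypotheses, weights)
    # Cross-hypothesis max pooling: for each class, keep the highest score
    # produced by any hypothesis.
    return scores.max(axis=0)  # shape: (num_classes,)


# Assumed toy sizes: 20 classes (as in Pascal VOC), 64-dim features,
# and 7 hypotheses; the hypothesis count is arbitrary by design.
rng = np.random.default_rng(0)
num_classes, feat_dim = 20, 64
weights = rng.normal(size=(feat_dim, num_classes))
hypotheses = rng.normal(size=(7, feat_dim))

image_scores = hcp_predict(hypotheses, weights)
```

Because the pooling takes a per-class maximum, adding a duplicate of an existing hypothesis leaves `image_scores` unchanged, which is one simple way to see why the design tolerates redundant hypotheses.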

