Attend and Guide (AG-Net): A Keypoints-Driven Attention-Based Deep Network for Image Recognition.

Publication information

IEEE Trans Image Process. 2021;30:3691-3704. doi: 10.1109/TIP.2021.3064256. Epub 2021 Mar 17.

DOI: 10.1109/TIP.2021.3064256
PMID: 33705316
Abstract

This article presents a novel keypoints-based attention mechanism for visual recognition in still images. Deep Convolutional Neural Networks (CNNs) have shown great success in recognizing images from distinctive classes, but their performance in discriminating fine-grained changes is not at the same level. We address this by proposing an end-to-end CNN model that learns meaningful features linking fine-grained changes using our novel attention mechanism. The model captures the spatial structure of an image by identifying semantic regions (SRs) and their spatial distributions, which proves to be the key to modeling subtle changes in images. We automatically identify these SRs by grouping the keypoints detected in a given image. The "usefulness" of these SRs for image recognition is measured by our attention mechanism, which focuses on the parts of the image that are most relevant to a given task. The framework applies to both traditional and fine-grained image recognition tasks and does not require manually annotated regions (e.g., bounding boxes of body parts or objects) for learning or prediction. Moreover, the proposed keypoints-driven attention mechanism can be easily integrated into existing CNN models. The framework is evaluated on six diverse benchmark datasets. The model outperforms state-of-the-art approaches by a considerable margin on the Distracted Driver V1 (Acc: 3.39%), Distracted Driver V2 (Acc: 6.58%), Stanford-40 Actions (mAP: 2.15%), People Playing Musical Instruments (mAP: 16.05%), Food-101 (Acc: 6.30%) and Caltech-256 (Acc: 2.59%) datasets.
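
The pipeline outlined in the abstract (detect keypoints, group them into semantic regions, then weight the regions by attention) can be illustrated with a rough sketch. The abstract does not specify the keypoint detector, the grouping rule, or the form of the attention, so everything below is an assumption made for illustration: keypoints are random 2-D points, grouping uses plain k-means, each region is described by average-pooling a feature map inside the bounding box of its keypoints, and attention is a softmax over dot products with a hypothetical query vector (which would be learned in a real model). The function names `group_keypoints`, `pool_regions`, and `attend` are invented here and are not taken from the paper.

```python
import numpy as np


def group_keypoints(keypoints, num_regions, num_iters=20, seed=0):
    """Group (N, 2) keypoints into regions with plain k-means (assumed grouping rule)."""
    rng = np.random.default_rng(seed)
    # Initialise the centres from distinct keypoints.
    start = rng.choice(len(keypoints), num_regions, replace=False)
    centres = keypoints[start].astype(float)
    for _ in range(num_iters):
        # Assign every keypoint to its nearest centre.
        dists = np.linalg.norm(keypoints[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centre to the mean of its members (leave it if it has none).
        for k in range(num_regions):
            members = keypoints[labels == k]
            if len(members) > 0:
                centres[k] = members.mean(axis=0)
    return labels


def pool_regions(feature_map, keypoints, labels):
    """Average-pool an (H, W, C) feature map inside each region's bounding box."""
    feats = []
    for k in np.unique(labels):  # only regions that contain keypoints
        pts = keypoints[labels == k]  # (x, y) in feature-map coordinates
        x0, y0 = np.floor(pts.min(axis=0)).astype(int)
        x1, y1 = np.ceil(pts.max(axis=0)).astype(int) + 1
        feats.append(feature_map[y0:y1, x0:x1].mean(axis=(0, 1)))
    return np.stack(feats)


def attend(region_feats, query):
    """Softmax attention over regions; returns the weights and the weighted feature."""
    scores = region_feats @ query / np.sqrt(region_feats.shape[1])
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights, weights @ region_feats


# Toy inputs standing in for a CNN feature map and detected keypoints.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(14, 14, 8))
keypoints = rng.uniform(0, 13, size=(30, 2))

labels = group_keypoints(keypoints, num_regions=4)
region_feats = pool_regions(feature_map, keypoints, labels)
weights, attended = attend(region_feats, query=rng.normal(size=8))
```

Here `weights` holds one score per region and plays the role of the "usefulness" measure in the abstract, while `attended` is the single feature vector a classifier head would consume. No region annotations are used at any point, which matches the claim that manually annotated regions are not required.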
