One Spatio-Temporal Sharpening Attention Mechanism for Light-Weight YOLO Models Based on Sharpening Spatial Attention.
Affiliations
School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China.
HDU-ITMO Joint Institute, Hangzhou Dianzi University, Hangzhou 310018, China.
Publication
Sensors (Basel). 2021 Nov 28;21(23):7949. doi: 10.3390/s21237949.
Attention mechanisms have demonstrated great potential for improving the performance of deep convolutional neural networks (CNNs). However, many existing methods are dedicated to developing channel or spatial attention modules for CNNs with large numbers of parameters, and complex attention modules inevitably degrade CNN performance. In our experiments embedding the Convolutional Block Attention Module (CBAM) in the light-weight model YOLOv5s, CBAM slows the model and increases its complexity while reducing average precision, whereas the Squeeze-and-Excitation (SE) module, as a component of CBAM, has a positive effect on the model. To replace the spatial attention module in CBAM and provide a suitable arrangement of channel and spatial attention modules, this paper proposes a Spatio-temporal Sharpening Attention Mechanism (SSAM), which sequentially infers intermediate attention maps through a channel attention module and a Sharpening Spatial Attention (SSA) module. By introducing a sharpening filter into the spatial attention module, we obtain an SSA module with low complexity. We seek a scheme that combines our SSA module with the SE module or the Efficient Channel Attention (ECA) module and yields the greatest improvement in models such as YOLOv5s and YOLOv3-tiny. To this end, we perform a series of replacement experiments and identify the best scheme: embedding channel attention modules in the backbone and neck of the model and integrating SSAM into the YOLO head. We verify the positive effect of SSAM on two general object detection datasets, VOC2012 and MS COCO2017: the first is used to determine a suitable scheme and the second to demonstrate the versatility of our method in complex scenes. Experimental results on both datasets show clear gains in average precision and detection performance, demonstrating the usefulness of SSAM in light-weight YOLO models. Furthermore, visualization results show that SSAM enhances the models' localization ability.
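The abstract does not give implementation details, so the following is a minimal PyTorch-style sketch of how a sharpening spatial attention module chained after a channel attention module might look. The class names (SharpeningSpatialAttention, SSAM), the choice of SE as the channel attention, the CBAM-style average/max pooling, and the 3×3 Laplacian-type sharpening kernel are all illustrative assumptions, not the authors' exact design.

```python
# Hypothetical sketch of SSAM: channel attention (here SE) followed by a
# Sharpening Spatial Attention (SSA) module. Kernel, pooling, and ordering
# are assumptions based only on the abstract, not the published code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SqueezeExcitation(nn.Module):
    """Standard SE channel attention."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        # Global average pooling -> per-channel weights in (0, 1).
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        return x * w


class SharpeningSpatialAttention(nn.Module):
    """Spatial attention built on a fixed sharpening filter instead of a
    learned large-kernel convolution, keeping the module low-complexity."""
    def __init__(self):
        super().__init__()
        # Classic 3x3 sharpening kernel (assumed; the paper's filter may differ).
        kernel = torch.tensor([[0., -1., 0.],
                               [-1.,  5., -1.],
                               [0., -1., 0.]]).view(1, 1, 3, 3)
        self.register_buffer("kernel", kernel)

    def forward(self, x):
        # Channel-wise average and max maps, as in CBAM's spatial attention.
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.max(dim=1, keepdim=True).values
        pooled = avg_map + max_map
        # Fixed sharpening convolution emphasizes edges and object positions.
        sharpened = F.conv2d(pooled, self.kernel, padding=1)
        return x * torch.sigmoid(sharpened)


class SSAM(nn.Module):
    """Sequential channel attention then sharpening spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel_att = SqueezeExcitation(channels)
        self.spatial_att = SharpeningSpatialAttention()

    def forward(self, x):
        return self.spatial_att(self.channel_att(x))
```

As a usage sketch, SSAM(256) could be applied to a 256-channel feature map in a YOLO head, e.g. y = SSAM(256)(torch.randn(1, 256, 40, 40)); the channel branch reweights channels globally, and the fixed sharpening kernel then accentuates spatial detail while adding no learnable parameters beyond the SE layers.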