School of Communication, Beijing Information Science and Technology University, Beijing, China.
Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing, China.
PLoS One. 2024 Feb 13;19(2):e0296992. doi: 10.1371/journal.pone.0296992. eCollection 2024.
Synthetic Aperture Radar (SAR) ship detection faces two main challenges: large variations in target size and high computational cost, both of which hinder practical deployment on satellite or mobile airborne platforms. In response, this research presents YOLOv7-LDS, a lightweight yet highly accurate SAR ship detection model built upon the YOLOv7 framework. At the core of the YOLOv7-LDS architecture, we introduce a streamlined feature extraction network that balances detection precision against computational efficiency. This network is based on ShuffleNetV2 and incorporates Squeeze-and-Excitation (SE) attention mechanisms as its key elements. In the Neck section, we introduce the Weighted Efficient Aggregation Network (DCW-ELAN), a fundamental feature extraction module that leverages Coordinate Attention (CA) and Depthwise Convolution (DWConv). This module aggregates features efficiently while preserving the ability to identify small-scale variations. Furthermore, we introduce a lightweight Spatial Pyramid Dilated Convolution Cross-Stage Partial Channel (LSPHDCCSPC) module. LSPHDCCSPC is a condensed version of the Spatial Pyramid Pooling Cross-Stage Partial Channel (SPPCSPC) module that uses Dilated Convolution (DConv) as its central component for extracting multi-scale information. The experimental results show that YOLOv7-LDS achieves a Mean Average Precision (mAP) of 99.1% and 95.8% on the SAR Ship Detection Dataset (SSDD) and the NWPU VHR-10 dataset, respectively, with a parameter count (Params) of 3.4 million, a computational cost of 6.1 giga floating-point operations (GFLOPs), and an Inference Time (IT) of 4.8 milliseconds. YOLOv7-LDS balances computational cost and detection performance effectively, surpassing many of the current state-of-the-art object detection models. As a result, it offers a more resilient solution for maritime ship monitoring.
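To make the lightweight-design argument concrete, the following minimal sketch uses the standard textbook formulas for convolution parameter counts and for the effective kernel size of a dilated convolution. It is not the authors' implementation. The function names, the channel count of 128, and the dilation rates are hypothetical values chosen for illustration, and the pointwise 1x1 convolution paired with the depthwise convolution is an assumption about how such layers are commonly combined, not a detail stated in the abstract.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_params(c_in, k):
    """Weights in a depthwise k x k convolution: one k x k filter per input channel."""
    return c_in * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1x1 pointwise convolution.

    Assumption: the pointwise stage is the usual companion that mixes channels.
    """
    return depthwise_params(c_in, k) + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel with the given dilation rate.

    The number of weights stays at k per axis; only the covered span grows.
    """
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    c, k = 128, 3  # hypothetical layer size, not taken from the paper
    standard = conv_params(c, c, k)
    separable = depthwise_separable_params(c, c, k)
    print(f"standard 3x3 conv weights:      {standard}")   # 147456
    print(f"depthwise + pointwise weights:  {separable}")  # 17536
    print(f"reduction factor:               {standard / separable:.2f}")

    # Hypothetical dilation rates: same 3x3 weights, wider receptive span.
    for d in (1, 2, 3, 5):
        print(f"dilation {d}: effective kernel {effective_kernel_size(k, d)}")
```

Under these assumptions, a depthwise convolution cuts the weight count of a layer several-fold, and a dilated 3x3 kernel covers a wider span without adding weights, which is consistent with the abstract's use of DWConv for efficiency and DConv for multi-scale information.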