Cheng Shun, Wang Zhiqian, Liu Shaojin, Han Yan, Sun Pengtao, Li Jianrong
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Relative Pose Precision Measurement Laboratory, Jilin 130033, China.
Graduate School, University of Chinese Academy of Sciences, Beijing 100049, China.
Sensors (Basel). 2024 Nov 29;24(23):7640. doi: 10.3390/s24237640.
Underwater object detection is a complex task that demands both high speed and high accuracy. This paper proposes SPSM-YOLOv8 (SPDConv-PSA-SCDown-MPDIoU-YOLOv8), an underwater target detection model based on YOLOv8 that addresses high computational complexity, slow detection speed, and low accuracy. First, the SPDConv module replaces the standard convolutional module for feature extraction in the backbone network, which improves computational efficiency and reduces redundant computation. Second, a PSA (Polarized Self-Attention) mechanism is added to apply polarized filtering and enhancement to features in the channel and spatial dimensions, improving the accuracy of pixel-level prediction. Third, the SCDown (spatial-channel decoupled downsampling) mechanism is introduced; it reduces computational cost by decoupling the spatial and channel operations while retaining information during downsampling. Finally, the MPDIoU (Minimum Point Distance-based IoU) loss replaces the CIoU (Complete-IoU) loss to accelerate bounding box convergence and improve bounding box regression accuracy. Experimental results show that SPSM-YOLOv8 reaches a detection accuracy of 87.3% on the ROUD dataset and 76.4% on the UPRC2020 dataset, while the number of parameters and the amount of computation decrease by 4.3% and 4.9%, respectively, compared with the YOLOv8n baseline model. The detection frame rate reaches 189 frames per second on the ROUD dataset. The model thus meets the high accuracy requirements of underwater object detection algorithms and facilitates lightweight, fast edge deployment.
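To illustrate the MPDIoU loss named in the abstract, the following is a minimal sketch based on the commonly published definition of MPDIoU: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. It is not the authors' training code. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` pixel box format, and the `eps` term are assumptions made for illustration.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Hypothetical helper: MPDIoU loss for one pair of axis-aligned boxes.

    pred, target: (x1, y1, x2, y2) in pixels, with x1 < x2 and y1 < y2.
    img_w, img_h: width and height of the input image, used for normalization.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU; eps guards against division by zero.
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm

    return 1.0 - mpdiou


# Identical boxes give a loss close to 0; disjoint boxes give a loss above 1.
same = mpdiou_loss((10, 10, 50, 50), (10, 10, 50, 50), 100, 100)
apart = mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100)
```

Because the corner-distance terms remain nonzero even when the boxes do not overlap, the loss still varies with box position in that case, which is consistent with the faster convergence the abstract attributes to MPDIoU.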