Li Hua, Yang Fan, Huo Junzhou, Gao Qiang, Deng Shusen, Guo Chang
School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China.
Sensors (Basel). 2025 Jun 26;25(13):3990. doi: 10.3390/s25133990.
Detecting surface crack defects in metal is essential to the safe operation of industrial equipment. However, most mainstream deep learning object detection models have complex structures, large parameter counts, and high training costs, which hinder their deployment at frontline construction sites. This paper therefore optimizes the head structure of the existing YOLO series and proposes ZoomHead, a lightweight detection head with lower computational complexity and stronger detection performance. First, the GroupNorm2d module replaces the BatchNorm2d module to stabilize the model's feature distribution and accelerate training. Second, Detail Enhanced Convolution (DEConv) replaces conventional convolution kernels, and shared convolution is adopted to reduce redundant structures; together these enhance the ability to capture details and improve detection performance for small objects. Next, a Zoom scale factor is introduced to scale the convolution kernels in the regression branch proportionally, minimizing redundant computation. Finally, with the YOLOv10 and YOLO11 series models as baselines, ZoomHead replaced the baseline head structure entirely, and a series of performance comparison experiments was conducted on a rail surface crack dataset and the NEU surface defect database. The results show that integrating ZoomHead improved detection accuracy, reduced the number of parameters and computations, and increased FPS, achieving a good balance between detection accuracy and speed. In the comparison with the SOTA model, the model with ZoomHead had the fewest parameters and the highest FPS while maintaining the same mAP as the SOTA model, indicating that the proposed ZoomHead structure offers better overall detection performance.
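The parameter saving attributed to shared convolution can be illustrated with simple counting. The following is a minimal sketch, not the ZoomHead implementation: the channel widths, the hidden width, the 1x1 projection layers, and the helper names `conv_params` and `head_params` are all assumptions chosen for illustration, and DEConv, GroupNorm2d, and the Zoom scale factor are not modeled.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Learnable parameters in one k x k 2D convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def head_params(channels_per_level, hidden, k=3, shared=False):
    """Parameter count of a toy detection-head stage (hypothetical layout).

    shared=False: every pyramid level owns its own k x k convolution.
    shared=True:  every level is first projected to `hidden` channels by a
                  1x1 convolution, then ONE k x k convolution is reused
                  across all levels.
    """
    if not shared:
        return sum(conv_params(c, hidden, k) for c in channels_per_level)
    project = sum(conv_params(c, hidden, 1) for c in channels_per_level)
    return project + conv_params(hidden, hidden, k)


# Hypothetical channel widths for three feature-pyramid levels.
levels = [64, 128, 256]

separate = head_params(levels, hidden=64)
shared = head_params(levels, hidden=64, shared=True)

print(f"per-level convolutions: {separate} parameters")
print(f"shared convolution:     {shared} parameters")
```

Under these assumed widths, the single shared 3x3 convolution plus the cheap 1x1 projections needs roughly a quarter of the parameters of three separate 3x3 convolutions. The actual saving in ZoomHead depends on its real layer layout, which the abstract does not specify.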