Wang Xiaoge, Sheng Yunlong, Hao Qun, Hou Haiyuan, Nie Suzhen
School of Mechanical Engineering, Shandong University of Technology, Zibo 255000, China.
Changchun University of Science and Technology, Changchun 130022, China.
Biomimetics (Basel). 2025 Jul 8;10(7):451. doi: 10.3390/biomimetics10070451.
To address background interference and limited multi-scale feature extraction in infrared small target detection, this paper proposes YOLO-HVS, a detection algorithm inspired by the human visual system. Building on YOLOv8, we design a multi-scale spatially enhanced attention module (MultiSEAM) that uses multi-branch depthwise separable convolution to suppress background noise and enhance occluded targets while integrating local details and global context. We also design the C2f_DWR (dilation-wise residual) module, whose regional-semantic dual residual structure uses dilated convolution and a two-step feature extraction mechanism to capture multi-scale contextual information more efficiently. We construct the DroneRoadVehicles dataset, which contains 1028 infrared images captured at 70-300 m and covers complex occlusion and multi-scale targets. Experiments show that YOLO-HVS achieves an mAP50 of 83.4% on the public DroneVehicle dataset and 97.8% on the self-built dataset, improvements of 1.1% and 0.7%, respectively, over the YOLOv8 baseline, while adding only 2.3 M parameters and 0.1 GFLOPs. The results indicate that the proposed approach is more robust in detecting targets under severe occlusion and low signal-to-noise ratio (SNR) conditions while supporting efficient real-time infrared small target detection.
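The two convolution variants named in the abstract can be illustrated with standard textbook arithmetic. The sketch below is not the authors' implementation. It only shows why dilated convolution widens the receptive field without adding weights, and why depthwise separable convolution needs fewer parameters than a standard convolution. The function names, the 3x3 kernel, the dilation rates 1, 3, and 5, and the 64-channel layer are illustrative assumptions, not values taken from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in pixels, along one axis) covered by a dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise k x k conv plus a 1 x 1 pointwise conv (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Assumed dilation rates, chosen only for illustration.
    for d in (1, 3, 5):
        print(f"3x3 kernel, dilation {d}: covers {effective_kernel_size(3, d)} pixels per axis")

    # Assumed layer shape: 64 input channels, 64 output channels, 3x3 kernel.
    print("standard conv weights:", standard_conv_params(64, 64, 3))
    print("depthwise separable weights:", depthwise_separable_params(64, 64, 3))
```

Under these assumed settings, a 3x3 kernel with dilation 5 spans 11 pixels per axis with the same nine weights per filter, and the depthwise separable layer uses roughly one eighth of the weights of the standard layer.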