Fan Xiangsuo, Ding Wentao, Li Xuyang, Li Tingting, Hu Bo, Shi Yuqiu
School of Automation, Guangxi University of Science and Technology, Liuzhou 545006, China.
Guangxi Collaborative Innovation Centre for Earthmoving Machinery, Guangxi University of Science and Technology, Liuzhou 545006, China.
Sensors (Basel). 2024 Jun 29;24(13):4227. doi: 10.3390/s24134227.
Infrared small target detection plays a crucial role in fields such as military reconnaissance, power inspection, medical diagnosis, and security. Advances in deep learning have made convolutional neural networks successful at target segmentation. However, because targets in infrared images are small, their signals are weak, and background interference is strong, convolutional neural networks often suffer from missed detections and false detections in small target segmentation tasks. To address this, an enhanced U-Net method called MST-UNet is proposed that combines multi-scale feature decomposition and fusion with attention mechanisms. The method uses the Haar wavelet transform instead of max pooling for downsampling in the encoder to minimize feature loss and improve feature utilization. In addition, a multi-scale residual unit is introduced to extract contextual information at different scales, improving the receptive field and feature expression. A triple attention mechanism in the encoder structure further enhances the use of multidimensional information and feature recovery by the decoder. Experiments on the NUDT-SIRST dataset show that the proposed method significantly improves target contour accuracy and segmentation precision, achieving IoU and nIoU values of 80.09% and 80.19%, respectively.
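The idea of replacing max pooling with a Haar wavelet transform can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `haar_downsample` and `haar_upsample`, the orthonormal scaling, and the single-channel 2D input are assumptions made for illustration. The sketch shows that a one-level Haar decomposition halves the spatial resolution while keeping all of the input in four sub-bands, so the original feature map can be recovered exactly, which is not possible after max pooling.

```python
import numpy as np


def haar_downsample(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns an array of shape (4, H/2, W/2): one low-frequency band and
    three detail bands. Hypothetical helper, for illustration only.
    """
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0      # scaled local average
    col_diff = (a - b + c - d) / 2.0  # difference between the two columns
    row_diff = (a + b - c - d) / 2.0  # difference between the two rows
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return np.stack([low, col_diff, row_diff, diag])


def haar_upsample(bands):
    """Inverse of haar_downsample: rebuilds the (H, W) array exactly."""
    low, col_diff, row_diff, diag = bands
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=low.dtype)
    x[0::2, 0::2] = (low + col_diff + row_diff + diag) / 2.0
    x[0::2, 1::2] = (low - col_diff + row_diff - diag) / 2.0
    x[1::2, 0::2] = (low + col_diff - row_diff - diag) / 2.0
    x[1::2, 1::2] = (low - col_diff - row_diff + diag) / 2.0
    return x


def max_pool_2x2(x):
    """Plain 2x2 max pooling, shown for comparison (not invertible)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.normal(size=(8, 8))
    bands = haar_downsample(feature_map)
    print(bands.shape)                     # four sub-bands at half resolution
    print(max_pool_2x2(feature_map).shape)  # one map at half resolution
    print(np.allclose(haar_upsample(bands), feature_map))
```

In a network, the four sub-bands would typically be stacked along the channel axis before the next convolution; how MST-UNet combines them is not stated in the abstract.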
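The abstract reports IoU and nIoU without defining them. The sketch below assumes the definitions commonly used in infrared small target segmentation work: IoU accumulates intersection and union pixels over the whole test set, while nIoU averages the per-image IoU. The function names, the binary-mask inputs, and the handling of images with an empty union are assumptions for illustration, not details taken from the paper.

```python
import numpy as np


def iou(preds, targets):
    """Dataset-level IoU: total intersection over total union.

    preds and targets are sequences of binary masks of equal shape.
    """
    inter = sum(np.logical_and(p, t).sum() for p, t in zip(preds, targets))
    union = sum(np.logical_or(p, t).sum() for p, t in zip(preds, targets))
    # Assumption: if no pixel is predicted or labeled anywhere, return 1.0.
    return float(inter) / float(union) if union > 0 else 1.0


def niou(preds, targets):
    """Normalized IoU: mean of the IoU computed separately on each image."""
    scores = []
    for p, t in zip(preds, targets):
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        # Assumption: an image with an empty union counts as a perfect match.
        scores.append(inter / union if union > 0 else 1.0)
    return float(np.mean(scores))


if __name__ == "__main__":
    preds = [np.array([[1, 1], [0, 0]], dtype=bool),
             np.array([[0, 0], [0, 1]], dtype=bool)]
    targets = [np.array([[1, 0], [0, 0]], dtype=bool),
               np.array([[0, 0], [0, 1]], dtype=bool)]
    print(iou(preds, targets))   # 2 intersecting pixels over 3 union pixels
    print(niou(preds, targets))  # mean of per-image IoUs 0.5 and 1.0
```

Because nIoU weights every image equally, a single large target cannot dominate the score the way it can in the pixel-accumulated IoU.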