Wu Dan, Wang Yanzhi, Wang Haoran, Wang Fei, Gao Guowang
School of Electronic Engineering, Xi'an Shiyou University, Xi'an 710312, China.
Sensors (Basel). 2024 Jun 22;24(13):4065. doi: 10.3390/s24134065.
To address the loss of detail, the blurring of salient targets, and the poor visual quality of current image fusion algorithms, this paper proposes an infrared and visible-light image fusion algorithm based on the discrete wavelet transform (DWT) and convolutional neural networks. The backbone network is an autoencoder. A DWT layer is embedded in the encoder to optimize frequency-domain feature extraction and prevent information loss, and a bottleneck residual block and a coordinate attention mechanism are introduced to strengthen the capture and representation of low- and high-frequency features. An inverse DWT (IDWT) layer is embedded in the decoder to reconstruct features from the fused frequency sub-bands. The fusion strategy uses the l1-norm to integrate the frequency feature maps output by the encoder, and a weighted loss combining pixel, gradient, and structural terms is constructed to guide network training. The DWT decomposes an image into sub-bands at different scales: the low-frequency sub-band carries the structural information that corresponds to salient targets, while the high-frequency sub-bands carry detail such as edges and texture. Through the IDWT, the low-frequency sub-band containing salient-target information is recombined with the detail-enhancing high-frequency sub-bands, ensuring that both targets and texture details remain clearly visible in the reconstructed image. Because the transform is invertible, the information in the different frequency sub-bands is reconstructed back into the image losslessly, so the fused image appears natural and visually harmonious.
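The DWT decomposition, l1-norm fusion, and lossless IDWT reconstruction described above can be sketched with a one-level Haar transform in NumPy. This is a hypothetical simplification for illustration only: the paper's network inserts learned convolutional features, a bottleneck residual block, and coordinate attention between the transform and the fusion step, and its exact l1-norm weighting scheme may differ from the per-pixel weights used here.

```python
import numpy as np

def haar_dwt2(x):
    # One-level 2-D Haar DWT: split into a low-frequency sub-band (LL,
    # image structure) and three high-frequency sub-bands (LH, HL, HH,
    # edge/texture detail), each at half resolution.
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # row-wise average
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # row-wise difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    # Inverse transform: perfectly (losslessly) reconstructs the image
    # from its four sub-bands.
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def l1_fuse(f_ir, f_vis, eps=1e-8):
    # l1-norm fusion strategy (simplified): weight each source sub-band
    # by its absolute activity, so the stronger response dominates.
    a_ir, a_vis = np.abs(f_ir), np.abs(f_vis)
    w_ir = a_ir / (a_ir + a_vis + eps)
    return w_ir * f_ir + (1.0 - w_ir) * f_vis

def fuse_images(ir, vis):
    # Decompose both inputs, fuse each frequency sub-band, reconstruct.
    sub_ir, sub_vis = haar_dwt2(ir), haar_dwt2(vis)
    fused = [l1_fuse(s_ir, s_vis) for s_ir, s_vis in zip(sub_ir, sub_vis)]
    return haar_idwt2(*fused)
```

A quick check of the "non-destructive" property: transforming an image and inverting without any fusion returns it exactly, which is why detail in the high-frequency sub-bands survives reconstruction.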
Experimental results on public datasets show that the proposed algorithm performs well under both subjective and objective evaluation criteria: the fused images are clearer and contain more scene information, verifying the algorithm's effectiveness. Generalization experiments further show that the network has good generalization ability.