Zhu Pan, Ouyang Wanqi, Guo Yongxing, Zhou Xinglin
Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan, China.
Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan, China.
Front Bioeng Biotechnol. 2022 Jul 14;10:923364. doi: 10.3389/fbioe.2022.923364. eCollection 2022.
Image fusion algorithms have great application value in computer vision: a fused image describes the scene more comprehensively and clearly, which benefits both human visual recognition and automatic machine detection. In recent years, image fusion algorithms have achieved great success in different domains; however, the generalization of multi-modal image fusion remains a major challenge. To address this problem, this paper proposes a general image fusion framework based on an improved convolutional neural network. First, the feature information of the input images is captured by multiple feature extraction layers, and the resulting feature maps are concatenated along the channel dimension to obtain a fused feature map. Finally, the feature maps derived from the multiple feature extraction layers are combined in high dimensions via skip connections and convolutional filtering to reconstruct the final result. In this paper, multi-modal images are gathered from multiple datasets to build a large sample space that adequately trains the network. Compared with existing convolutional neural networks and traditional fusion algorithms, the proposed model is not only general and stable but also offers advantages in subjective visual quality and objective evaluation, while its average running time is at least 94% faster than the reference neural-network-based algorithm.
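The fusion pipeline described above (per-modality feature extraction, concatenation along the channel dimension, then channel-mixing reconstruction) can be sketched as follows. This is an illustrative toy, not the paper's implementation: the filter count, the random stand-in filters, and the use of a channel-averaging mix in place of a learned convolution are all assumptions made for demonstration.

```python
import numpy as np

def extract_features(img, n_filters=4, seed=0):
    """Stand-in for a feature-extraction layer.

    A real layer would apply learned convolutional filters; here we
    just scale the image by random factors to get an (n_filters, H, W)
    feature stack with the right shape. Purely illustrative.
    """
    rng = np.random.default_rng(seed)
    return np.stack([img * rng.standard_normal() for _ in range(n_filters)])

def fuse(feat_a, feat_b):
    """Concatenate feature maps along channels, then mix back to one map.

    The uniform channel weights below play the role of a 1x1
    reconstruction convolution; in the actual network these weights
    would be learned.
    """
    stacked = np.concatenate([feat_a, feat_b], axis=0)   # channel concat
    weights = np.full(stacked.shape[0], 1.0 / stacked.shape[0])
    return np.tensordot(weights, stacked, axes=1)        # collapse channels

# Hypothetical inputs: two co-registered patches from different modalities
img_a = np.ones((8, 8))    # e.g. a visible-light patch
img_b = np.zeros((8, 8))   # e.g. an infrared patch
fused = fuse(extract_features(img_a), extract_features(img_b, seed=1))
print(fused.shape)  # (8, 8) — fused map has the input spatial size
```

The key structural point the sketch preserves is that fusion happens in feature space (after extraction, before reconstruction) rather than on raw pixels.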