Pan Tao, Jiang Jiaqin, Yao Jian, Wang Bin, Tan Bin
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430000, China.
School of Artificial Intelligence, The Open University of Guangdong, Guangzhou 510000, China.
Sensors (Basel). 2020 Jul 13;20(14):3901. doi: 10.3390/s20143901.
Multi-focus image fusion has become a very practical image processing task. It uses multiple images focused on different depth planes to create an all-in-focus image. Although extensive studies have been conducted, the performance of existing methods is still limited by inaccurate detection of the focused regions to be fused. In this paper, we therefore propose a novel U-shape network that generates an accurate decision map for multi-focus image fusion. The Siamese encoder of our U-shape network preserves, separately for each source image, both low-level cues with rich spatial details and high-level semantic information. Moreover, we introduce ResBlocks to expand the receptive field, which enhances the network's ability to distinguish between focused and defocused regions. In addition, in the bridge stage between the encoder and decoder, spatial pyramid pooling is adopted as a global perception fusion module to capture sufficient context information for learning the decision map. Finally, we use a hybrid loss that combines the binary cross-entropy loss and the structural similarity loss for supervision. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance.
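As a rough illustration of two ideas named in the abstract, the sketch below shows how a decision map can select pixels from two source images, and how a binary cross-entropy term and a structural similarity term might be combined into one loss. It is not the authors' implementation: the function names, the equal weighting of the two terms, and the use of a single global SSIM (rather than a sliding-window SSIM) are all assumptions made to keep the example short and dependency-free.

```python
import math


def fuse(img_a, img_b, decision):
    """Pixel-wise fusion: take img_a where decision is 1, img_b where it is 0."""
    return [
        [d * a + (1.0 - d) * b for a, b, d in zip(row_a, row_b, row_d)]
        for row_a, row_b, row_d in zip(img_a, img_b, decision)
    ]


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy between a predicted and a reference decision map."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """SSIM computed once over the whole map (assumption: no sliding window)."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mean_x) ** 2 for v in xs) / n
    var_y = sum((v - mean_y) ** 2 for v in ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def hybrid_loss(pred, target, ssim_weight=1.0):
    """BCE plus (1 - SSIM); ssim_weight is a hypothetical balancing factor."""
    return bce_loss(pred, target) + ssim_weight * (1.0 - global_ssim(pred, target))


if __name__ == "__main__":
    near_focused = [[10, 20], [30, 40]]   # toy source image A
    far_focused = [[50, 60], [70, 80]]    # toy source image B
    reference_map = [[1, 0], [0, 1]]      # 1 = pixel is in focus in image A
    predicted_map = [[0.9, 0.2], [0.1, 0.8]]
    print(fuse(near_focused, far_focused, reference_map))
    print(hybrid_loss(predicted_map, reference_map))
```

In this toy setting the loss shrinks as the predicted map approaches the reference map; the cross-entropy term penalizes per-pixel errors while the SSIM term penalizes structural disagreement between the two maps.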