Xu Pengcheng, Liu Qun, Bao Huanan, Zhang Ruhui, Gu Lihua, Wang Guoyin
IEEE Trans Image Process. 2024;33:1710-1725. doi: 10.1109/TIP.2024.3368960. Epub 2024 Mar 7.
Deep learning has excelled in single-image super-resolution (SISR) applications, yet the lack of interpretability in most deep learning-based SR networks hinders their applicability, especially in fields such as medical imaging that require transparent computation. To address this problem, we present an interpretable frequency division SR network that operates in the image frequency domain. It comprises a frequency division module and a step-wise reconstruction method, which together divide the image into different frequencies and reconstruct each accordingly. We develop a frequency division loss function to ensure that each reconstruction module (ReM) operates solely at one image frequency. These methods establish an interpretable framework for SR networks, visualizing the image reconstruction process and reducing the black-box nature of SR networks. Additionally, we revisit the subpixel layer upsampling process by deriving its inverse process and designing a displacement generation module. This interpretable upsampling process incorporates subpixel information and is similar to pre-upsampling frameworks. Furthermore, we develop a new ReM based on interpretable Hessian attention to enhance network performance. Extensive experiments demonstrate that our network, without the frequency division loss, outperforms state-of-the-art methods qualitatively and quantitatively. Including the frequency division loss enhances the network's interpretability and robustness, at the cost of a slight average decrease of 0.48 dB in PSNR and 0.0049 in SSIM.
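For readers unfamiliar with the subpixel layer mentioned above, the following is a minimal sketch of the standard subpixel rearrangement and its inverse, written in plain Python on nested lists. It is an illustration only: it does not reproduce the paper's displacement generation module or its derivation, and the function names `pixel_shuffle` and `pixel_unshuffle` are chosen here for illustration.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Standard subpixel layout (an assumption about the layer the abstract
    refers to): out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w].
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


def pixel_unshuffle(y, r):
    """Inverse rearrangement: (C, H*r, W*r) back to (C*r*r, H, W)."""
    c, hr, wr = len(y), len(y[0]), len(y[0][0])
    if hr % r != 0 or wr % r != 0:
        raise ValueError("height and width must be divisible by r")
    h, w = hr // r, wr // r
    out = [[[0] * w for _ in range(h)] for _ in range(c * r * r)]
    for ch in range(c):
        for i in range(r):
            for j in range(r):
                dst = out[ch * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        dst[row][col] = y[ch][row * r + i][col * r + j]
    return out


# Four 1x2 low-resolution channels become one 2x4 high-resolution channel.
low_res = [[[0, 1]], [[2, 3]], [[4, 5]], [[6, 7]]]
high_res = pixel_shuffle(low_res, 2)
restored = pixel_unshuffle(high_res, 2)
```

In this sketch each group of r*r channels supplies the r-by-r block of output pixels at every low-resolution location, which is why the inverse can recover the input exactly by reading those blocks back into channels.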