Single Stage Adaptive Multi-Attention Network for Image Restoration.

Author information

Zafar Anas, Aftab Danyal, Qureshi Rizwan, Fan Xinqi, Chen Pingjun, Wu Jia, Ali Hazrat, Nawaz Shah, Khan Sheheryar, Shah Mubarak

Publication information

IEEE Trans Image Process. 2024;33:2924-2935. doi: 10.1109/TIP.2024.3384838. Epub 2024 Apr 23.

DOI:10.1109/TIP.2024.3384838

PMID:38598372

Abstract

Recently, attention-based networks have been successful in image restoration tasks. However, existing methods are either computationally expensive or have limited receptive fields, which constrains the model. They are also less resilient in spatial and contextual aspects and lack pixel-to-pixel correspondence, which may degrade feature representations. In this paper, we propose a novel and computationally efficient architecture, the Single Stage Adaptive Multi-Attention Network (SSAMAN), for image restoration tasks, particularly image denoising and image deblurring. SSAMAN efficiently addresses computational challenges and expands receptive fields, enhancing robustness in spatial and contextual feature representation. Its Adaptive Multi-Attention Module (AMAM), which consists of an Adaptive Pixel Attention Branch (APAB) and an Adaptive Channel Attention Branch (ACAB), uniquely integrates channel and pixel-wise dimensions, significantly improving sensitivity to edges, shapes, and textures. We perform extensive experiments and ablation studies to validate the performance of SSAMAN. Our model shows state-of-the-art results on various benchmarks. For example, on image denoising, SSAMAN achieves 40.08 dB PSNR on the SIDD dataset, outperforming Restormer by 0.06 dB PSNR at 41.02% less computational cost, and achieves 40.05 dB PSNR on the DND dataset. For image deblurring, SSAMAN achieves 33.53 dB PSNR on the GoPro dataset. Code and models are available on GitHub.
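
The results above are all reported as PSNR in dB. As a reading aid, the following is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), and of how a dB gain maps to a change in mean squared error. It is not taken from the authors' code: the function names `psnr` and `mse_ratio`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are passed as flat sequences of pixel intensities (an assumption
    for this sketch; real pipelines usually operate on arrays).
    """
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    if not reference:
        raise ValueError("images must not be empty")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio(psnr_gain_db):
    """Factor by which the MSE is multiplied for a given PSNR gain in dB."""
    return 10.0 ** (-psnr_gain_db / 10.0)
```

Under this definition, `mse_ratio(0.06)` is about 0.986, so the 0.06 dB margin over Restormer reported on SIDD corresponds to roughly 1.4% lower mean squared error.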
