School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, 510275, China.
School of Information Engineering, Guangdong University of Technology, Guangzhou, 510006, China.
Neural Netw. 2024 Dec;180:106686. doi: 10.1016/j.neunet.2024.106686. Epub 2024 Aug 31.
Vision Transformers have achieved impressive performance in image super-resolution. However, they suffer from low inference speed, mainly because of the quadratic complexity of multi-head self-attention (MHSA), which is key to learning long-range dependencies. In contrast, most CNN-based methods neglect the important effect of global contextual information, resulting in inaccurate and blurred details. A method that makes the best of both Transformers and CNNs would achieve a better trade-off between image quality and inference speed. Based on this observation, we first hypothesize that the main factor affecting the performance of Transformer-based SR models is the overall architecture design, not the specific MHSA component. To verify this, we conduct ablation studies in which MHSA is replaced with large-kernel convolutions, alongside other essential module replacements. Surprisingly, the derived models achieve competitive performance. We therefore extract a general architecture design, GlobalSR, which leaves the core modules of Transformer-based SR models (the blocks and domain embeddings) unspecified. It also provides three practical guidelines for designing a lightweight SR network that exploits image-level global contextual information to reconstruct SR images. Following these guidelines, we instantiate the blocks and domain embeddings of GlobalSR with a Deformable Convolution Attention Block (DCAB) and a Fast Fourier Convolution Domain Embedding (FCDE), respectively. The resulting instantiation, termed GlobalSR-DF, proposes Deformable Convolution Attention (DCA) to extract global contextual features at the block level, using deformable convolution and a Hadamard product as the attention map. Meanwhile, the FCDE applies the Fast Fourier Transform to map input spatial features into the frequency domain and then extracts image-level global information from them with convolutions.
Extensive experiments demonstrate that GlobalSR is the key to achieving a superior trade-off between SR quality and efficiency. Specifically, our proposed GlobalSR-DF outperforms state-of-the-art CNN-based and ViT-based SISR models in the accuracy-speed trade-off while producing sharp and natural details.
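The two core ideas of GlobalSR-DF, a Hadamard-product attention map at the block level and frequency-domain channel mixing for image-level global context, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the deformable convolution of DCA is replaced here by a plain 1x1 channel projection for brevity, and FCDE's convolutions are reduced to a single per-frequency channel mixing, which is equivalent to a spatially global operation. All function names and weight shapes are illustrative assumptions.

```python
import numpy as np

def hadamard_attention(x, w_attn):
    """Simplified block-level attention in the spirit of DCA: a learned map
    is applied to the feature via an element-wise (Hadamard) product.
    The paper uses deformable convolution to build the map; a plain 1x1
    projection stands in for it here (an assumption for illustration)."""
    # x: (C, H, W) feature map; w_attn: (C, C) hypothetical 1x1-conv weights
    attn = np.einsum('oc,chw->ohw', w_attn, x)
    attn = 1.0 / (1.0 + np.exp(-attn))   # sigmoid gating
    return x * attn                       # Hadamard product

def fourier_domain_embedding(x, w_freq):
    """Sketch of the FCDE idea: transform spatial features to the frequency
    domain with a 2-D FFT, mix channels there (a 1x1 'convolution' in
    frequency space, which is global in the spatial domain), then transform
    back to the spatial domain."""
    # x: (C, H, W) real feature map; w_freq: (C, C) mixing weights
    freq = np.fft.rfft2(x, axes=(-2, -1))           # (C, H, W//2 + 1), complex
    mixed = np.einsum('oc,chw->ohw', w_freq, freq)  # per-frequency channel mixing
    return np.fft.irfft2(mixed, s=x.shape[-2:], axes=(-2, -1))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))       # toy 8-channel 16x16 feature
w = rng.standard_normal((8, 8)) * 0.1
y = fourier_domain_embedding(hadamard_attention(x, w), w)
print(y.shape)  # (8, 16, 16)
```

Because a pointwise product in the frequency domain corresponds to a convolution over the entire spatial extent, every output position in `y` depends on all input positions, which is the image-level global receptive field the abstract attributes to FCDE.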