Image harmonization with Simple Hybrid CNN-Transformer Network.

Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, China; School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, China.

School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, China; Key Laboratory of Intelligent Interaction and Application (Northwestern Polytechnical University), Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, China.

Publication information

Neural Netw. 2024 Dec;180:106673. doi: 10.1016/j.neunet.2024.106673. Epub 2024 Aug 30.

DOI:10.1016/j.neunet.2024.106673

PMID:39260009

Abstract

Image harmonization seeks to transfer the illumination distribution of the background to the foreground within a composite image. Existing methods lack the ability to establish global-local pixel illumination dependencies between the foreground and background of composite images, which is indispensable for generating sharp and color-consistent harmonized images. To overcome this challenge, we design a novel Simple Hybrid CNN-Transformer Network (SHT-Net), formulated as an efficient symmetrical hierarchical architecture. It is composed of two newly designed lightweight Transformer blocks. First, the scale-aware gated block captures multi-scale features through different heads and expands the receptive field, which facilitates generating images with fine-grained details. Second, we introduce a simple parallel attention block, which integrates window-based self-attention and gated channel attention in parallel, enabling simultaneous modeling of global and local pixel illumination relationships. In addition, we propose an efficient simple feed-forward network that filters out less informative features and lets through only the features that contribute to generating photo-realistic harmonized results. Extensive experiments on image harmonization benchmarks indicate that our method achieves promising quantitative and qualitative results. The code and pre-trained models are available at https://github.com/guanguanboy/SHT-Net.
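
To make the idea of running window-based self-attention and gated channel attention in parallel more concrete, the following is a minimal NumPy sketch. It is not the authors' implementation (see the repository above for that). Everything in it is an assumption made for illustration: the function names, the use of a single head with no learned query/key/value projections, the sigmoid gate driven by globally pooled channel statistics, the hypothetical weight matrix `w_gate`, and the fusion of the two branches by a residual sum.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def window_self_attention(x, window):
    """Single-head self-attention inside non-overlapping windows.

    x has shape (H, W, C); H and W must be divisible by `window`.
    Assumption: Q = K = V = x (no learned projections), to keep the sketch short.
    """
    H, W, C = x.shape
    nh, nw = H // window, W // window
    # Partition into windows: (nh, nw, window*window, C).
    xw = x.reshape(nh, window, nw, window, C).transpose(0, 2, 1, 3, 4)
    xw = xw.reshape(nh, nw, window * window, C)
    # Scaled dot-product attention among the pixels of each window.
    scores = xw @ xw.transpose(0, 1, 3, 2) / np.sqrt(C)
    out = softmax(scores, axis=-1) @ xw
    # Undo the window partition to recover the (H, W, C) layout.
    out = out.reshape(nh, nw, window, window, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C)


def gated_channel_attention(x, w_gate):
    """Rescale each channel by a sigmoid gate computed from global statistics.

    Assumption: the gate is sigmoid(mean_over_pixels(x) @ w_gate),
    where w_gate is a hypothetical (C, C) weight matrix.
    """
    pooled = x.mean(axis=(0, 1))                      # (C,) image-wide summary
    gate = 1.0 / (1.0 + np.exp(-(pooled @ w_gate)))   # (C,) values in (0, 1)
    return x * gate


def parallel_attention(x, w_gate, window=4):
    """Run both branches on the same input and fuse them.

    Assumption: fusion is a residual sum of the two branch outputs.
    """
    local_branch = window_self_attention(x, window)       # window-level context
    global_branch = gated_channel_attention(x, w_gate)    # image-level context
    return x + local_branch + global_branch


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 16))
weights = rng.standard_normal((16, 16)) * 0.1
print(parallel_attention(features, weights, window=4).shape)  # (8, 8, 16)
```

In this sketch the window branch only mixes pixels within the same window, while the channel gate depends on statistics pooled over the whole feature map, which is one plausible reading of how a parallel design could combine local and global information.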
