Lightweight Deep Exemplar Colorization via Semantic Attention-Guided Laplacian Pyramid.

Author information

Zou Chengyi, Wan Shuai, Blanch Marc Gorriz, Murn Luka, Mrak Marta, Sock Juil, Yang Fei, Herranz Luis

Publication information

IEEE Trans Vis Comput Graph. 2025 Aug;31(8):4257-4269. doi: 10.1109/TVCG.2024.3398791.

Abstract

Exemplar-based colorization aims to generate plausible colors for a grayscale image with the guidance of a color reference image. The main challenge is finding the correct semantic correspondence between the target image and the reference image. However, existing methods often confuse the colors of the object and the background. In addition, these methods usually rely on simple encoder-decoder architectures or pyramid structures to extract features and lack appropriate fusion mechanisms, which results in either the loss of high-frequency information or high complexity. To address these problems, this article proposes a lightweight semantic attention-guided Laplacian pyramid network (SAGLP-Net) for deep exemplar-based colorization, exploiting the inherent multi-scale properties of color representations. These properties are exploited through a Laplacian pyramid, and semantic information is introduced as high-level guidance to align the object and background information. Specifically, a semantic-guided non-local attention fusion module is designed to exploit long-range dependencies and fuse local and global features. Moreover, a Laplacian pyramid fusion module based on criss-cross attention is proposed to fuse high-frequency components in the large-scale domain. An unsupervised multi-scale multi-loss training strategy is further introduced for network training, which combines pixel loss, color histogram loss, total variation regularisation, and adversarial loss. Experimental results demonstrate that the proposed colorization method achieves better subjective and objective performance with lower complexity than state-of-the-art methods.
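As a rough illustration of the Laplacian pyramid that the abstract builds on, the sketch below decomposes an image into per-scale high-frequency residuals plus a low-resolution base and then reconstructs it. It is not the SAGLP-Net implementation, which the abstract does not detail: the function names (`downsample`, `upsample`, `build_laplacian_pyramid`, `reconstruct`) are hypothetical, 2x2 average pooling and nearest-neighbour upsampling stand in for the usual Gaussian filtering, and the image height and width are assumed to be divisible by 2 to the power of `levels`.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging each 2x2 block (assumed stand-in for blur + decimate)."""
    h, w = img.shape[:2]
    return img.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))

def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return [residual_0, ..., residual_{levels-1}, base], finest scale first."""
    current = img.astype(np.float64)
    if current.ndim == 2:
        current = current[..., None]  # add a channel axis for grayscale input
    pyramid = []
    for _ in range(levels):
        smaller = downsample(current)
        # The residual holds the detail lost by going one scale down.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # low-frequency base
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding residuals back, coarse to fine."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current

# Usage on a random stand-in image (assumed 64x64 RGB).
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
pyr = build_laplacian_pyramid(image, levels=3)
restored = reconstruct(pyr)
print([level.shape for level in pyr])
print(np.allclose(restored, image))
```

Because each residual stores exactly what the coarser level cannot represent, the decomposition is invertible up to floating-point error. This is one way to see why a pyramid of this kind can retain high-frequency detail that a plain encoder-decoder may discard.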
