Yang Anyu, Kashif Hanif Muhammad
International School of Arts, Dalian University of Foreign Languages, Dalian, Liaoning, China.
Department of Computer Science, Government College University, Faisalabad, Pakistan.
PeerJ Comput Sci. 2024 Mar 18;10:e1889. doi: 10.7717/peerj-cs.1889. eCollection 2024.
Through the application of computer vision and deep learning methodologies, real-time style transfer of images becomes achievable. This process fuses diverse artistic elements into a single image, producing innovative works of art. This article focuses on image style transfer in art education and introduces an ATT-CycleGAN model enriched with an attention mechanism to improve the quality and precision of style conversion. The framework enhances the generators within CycleGAN. First, images undergo encoder downsampling before entering the intermediate transformation model. In this intermediate transformation model, feature maps are obtained through four encoding residual blocks and then passed to an attention module. Channel attention is incorporated through multi-weight optimization achieved via global max-pooling and global average-pooling. During training, transfer learning is employed to improve model parameter initialization and training efficiency. Experimental results demonstrate the superior performance of the proposed model in image style transfer across various categories. Compared with the traditional CycleGAN model, it shows notable gains in structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR). Specifically, on the Places365 and selfie2anime datasets, SSIM increases by 3.19% and 1.31%, respectively, and PSNR increases by 10.16% and 5.02%, respectively. These findings provide valuable algorithmic support and useful references for future research in art education, image segmentation, and style transfer.
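The channel-attention step described above (global max-pooling and global average-pooling feeding a multi-weight gating of the channels) can be sketched as follows. This is a minimal NumPy illustration in the CBAM style, not the authors' implementation: the abstract does not specify the scoring layers, so the shared two-layer MLP and the weights `w1`, `w2` are assumptions for the sketch.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Gate the channels of a (C, H, W) feature map.

    Global average-pooling and global max-pooling each yield a (C,)
    descriptor; a shared two-layer MLP (w1: (C//r, C), w2: (C, C//r))
    scores both, the scores are summed, and a sigmoid produces a
    per-channel weight that rescales the feature map.
    """
    avg = feat.mean(axis=(1, 2))            # (C,) global average-pool
    mx = feat.max(axis=(1, 2))              # (C,) global max-pool

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0)   # shared MLP with ReLU

    scores = mlp(avg) + mlp(mx)             # fuse the two descriptors
    gate = 1.0 / (1.0 + np.exp(-scores))    # sigmoid, each gate in (0, 1)
    return feat * gate[:, None, None]       # reweight channels
```

Because each gate lies strictly in (0, 1), the module can only attenuate channels, never amplify them; in the full model this reweighted map would feed the remaining residual blocks of the generator.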
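For reference, the PSNR metric behind the reported gains has a standard definition (this is the textbook formula, not code from the paper; `peak` is the maximum possible pixel value, assumed 255 for 8-bit images):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher PSNR means the stylized output is closer (in mean-squared-error terms) to the reference, which is why the 10.16% and 5.02% PSNR gains indicate less distortion than the baseline CycleGAN.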