Zhu Xuanying, Lin Mugang, Yi Mengting, Zhao Huihuang
College of Computer Science and Technology, Hengyang Normal University, Hengyang, China.
Hunan Vocational Institute of Technology, Xiangtan, Hunan, China.
Sci Rep. 2024 Nov 28;14(1):29584. doi: 10.1038/s41598-024-81249-6.
Architectural photography style transfer, a task in computer vision, employs deep learning algorithms to transform the style of architectural photographs while preserving their key structure and content. Existing algorithms face challenges due to the intricate details of buildings, including diverse shapes, lines, and textures. Moreover, the artistic effects that matter in architectural photography, such as lighting, shadows, and atmosphere, demand high-quality image generation. Current algorithms often struggle with these complexities, leading to lost or blurred details and less realistic images. To overcome these challenges, this paper proposes a Photorealistic Attention Style Transfer Network. The proposed approach uses a semantic segmentation model to segment the input image into foreground and background components, which are then style-transferred independently. The transferred images are subsequently refined with a coordinate attention mechanism that focuses on intricate building parts. Additionally, the network incorporates loss functions that capture light, shadow, and color in the stylized images, ensuring realism while maintaining aesthetic appeal. In comparative experiments, the proposed network shows better image fidelity and overall aesthetics, and its SSIM and PSNR scores also exceed those of current mainstream methods.
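The abstract names two ideas that a small example can make concrete: recombining independently stylized foreground and background regions, and scoring the result with PSNR. The following is a minimal sketch only, not the authors' implementation. It assumes the two regions are merged by simple mask-weighted blending, which the abstract does not specify, and the function names `composite` and `psnr` are hypothetical. Images are represented as flat lists of pixel intensities to keep the sketch dependency-free; the PSNR formula used is the standard definition, 10 log10(MAX^2 / MSE).

```python
import math


def composite(stylized_fg, stylized_bg, mask):
    """Blend two independently stylized images with a foreground mask.

    Assumption: mask-weighted blending is used to merge the regions.
    All arguments are flat lists of equal length. Mask values lie in
    [0, 1], where 1 marks a foreground (building) pixel.
    """
    if not (len(stylized_fg) == len(stylized_bg) == len(mask)):
        raise ValueError("inputs must have the same length")
    return [m * f + (1.0 - m) * b
            for f, b, m in zip(stylized_fg, stylized_bg, mask)]


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: noise power is zero
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example: the first two pixels are foreground, the last two background.
fg = [10, 20, 30, 40]
bg = [100, 100, 100, 100]
mask = [1, 1, 0, 0]
merged = composite(fg, bg, mask)
score = psnr(bg, merged)  # compare the merged result against one input
```

A higher PSNR means the test image is numerically closer to the reference. SSIM, the other metric mentioned, compares local luminance, contrast, and structure, and is usually computed with an image-processing library rather than by hand.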