Character generation and visual quality enhancement in animated films using deep learning.

Authors

Cao Weiran, Huang Zhongbin

Affiliations

School of Art and Archaeology, Hangzhou City University, Hangzhou, 310015, China.

College of Information Technology and Convergence, Pukyong National University, Busan, South Korea.

Publication information

Sci Rep. 2025 Jul 2;15(1):23409. doi: 10.1038/s41598-025-07442-3.

Abstract

As artificial intelligence and deep learning are increasingly applied to animated film production, improving the quality and accuracy of generated images, and thereby the visual communication of animated films, has become an important research direction. This work optimizes the first-order motion model (FOMM) to improve its performance in generating animated character images. The convolutional block attention module (CBAM) is introduced into FOMM and then redesigned to strengthen the network's ability to focus on important features, particularly its accuracy against complex backgrounds. To address the image distortion caused by severe pose changes, a repainting image repair module is proposed; through multi-scale upsampling and occlusion map prediction, it improves the coherence and completeness of image reconstruction. The resulting enhanced FOOM (E-FOOM) model tightly couples the attention mechanism with the reconstruction module, yielding a more robust end-to-end character image generation framework. Experiments on the VoxCeleb1 and TaiChiHD datasets show that E-FOOM outperforms existing models in generated image quality, keypoint detection accuracy, and pose reconstruction. Its generated images improve peak signal-to-noise ratio (PSNR) by at least 1.11 dB and the structural similarity index (SSIM) by at least 0.014, indicating superior pixel-level, structural, and perceptual quality. The work aims to raise the quality of generated character images in animated films and offers a technical pathway toward high-quality visual effects.
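
To give a sense of scale for the reported peak signal-to-noise ratio gain, the following is a minimal sketch of the standard PSNR definition, not the authors' evaluation code. It assumes 8-bit pixel values (maximum 255) and images passed as nested lists of rows; the function names `mse`, `psnr`, and `mse_ratio_for_gain` are hypothetical and chosen only for illustration.

```python
import math


def mse(reference, generated):
    """Mean squared error between two equally sized images (nested lists of pixel rows)."""
    flat_ref = [p for row in reference for p in row]
    flat_gen = [p for row in generated for p in row]
    if not flat_ref or len(flat_ref) != len(flat_gen):
        raise ValueError("images must be non-empty and have the same number of pixels")
    return sum((r - g) ** 2 for r, g in zip(flat_ref, flat_gen)) / len(flat_ref)


def psnr(reference, generated, max_value=255.0):
    """PSNR in dB: 10 * log10(max_value^2 / MSE). Assumes 8-bit images by default."""
    err = mse(reference, generated)
    if err == 0:
        return math.inf  # identical images have no noise
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks when PSNR rises by gain_db (same max_value)."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    ref = [[10, 20]]
    gen = [[12, 18]]
    print(f"PSNR: {psnr(ref, gen):.2f} dB")
    print(f"MSE factor for +1.11 dB: {mse_ratio_for_gain(1.11):.4f}")
```

Because PSNR is logarithmic in the mean squared error, a gain of 1.11 dB corresponds to the error falling to roughly 0.77 of its previous value, that is, a reduction of about 22 to 23 percent, provided both models are scored against the same reference images and pixel range.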

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ac5/12223153/8cc04b91ee3a/41598_2025_7442_Fig1_HTML.jpg
