Identity-Aware and Shape-Aware Propagation of Face Editing in Videos.

Authors

Jiang Yue-Ren, Chen Shu-Yu, Fu Hongbo, Gao Lin

Publication

IEEE Trans Vis Comput Graph. 2024 Jul;30(7):3444-3456. doi: 10.1109/TVCG.2023.3235364. Epub 2024 Jun 27.

DOI:10.1109/TVCG.2023.3235364

Abstract

The development of deep generative models has inspired various facial image editing methods, but many of them are difficult to apply directly to video editing because of challenges such as imposing 3D constraints, preserving identity consistency, and ensuring temporal coherence. To address these challenges, we propose a new framework operating on the StyleGAN2 latent space for identity-aware and shape-aware edit propagation on face videos. To make it easier to maintain identity, keep the original 3D motion, and avoid shape distortions, we disentangle the StyleGAN2 latent vectors of human face video frames to decouple the appearance, shape, expression, and motion from identity. An edit encoding module maps a sequence of image frames to continuous latent codes with 3D parametric control and is trained in a self-supervised manner with an identity loss and triple shape losses. Our model supports propagation of edits in various forms: (I) direct appearance editing on a specific keyframe, (II) implicit editing of face shape via a given reference image, and (III) existing latent-based semantic edits. Experiments show that our method works well for various forms of videos in the wild and outperforms an animation-based approach and recent deep generative techniques.

Similar articles

Talk-to-Edit: Fine-Grained 2D and 3D Facial Editing via Dialog.

IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):3692-3706. doi: 10.1109/TPAMI.2023.3347299. Epub 2024 Apr 3.

Image-to-Image Translation With Disentangled Latent Vectors for Face Editing.

IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14777-14788. doi: 10.1109/TPAMI.2023.3308102. Epub 2023 Nov 3.

MaskFaceGAN: High-Resolution Face Editing With Masked GAN Latent Code Optimization.

IEEE Trans Image Process. 2023;32:5893-5908. doi: 10.1109/TIP.2023.3326675. Epub 2023 Nov 1.

AttGAN: Facial Attribute Editing by Only Changing What You Want.

IEEE Trans Image Process. 2019 Nov;28(11):5464-5478. doi: 10.1109/TIP.2019.2916751. Epub 2019 May 20.

F³A-GAN: Facial Flow for Face Animation With Generative Adversarial Networks.

IEEE Trans Image Process. 2021;30:8658-8670. doi: 10.1109/TIP.2021.3112059. Epub 2021 Oct 21.

Identity preserving multi-pose facial expression recognition using fine tuned VGG on the latent space vector of generative adversarial network.

Math Biosci Eng. 2021 Apr 28;18(4):3699-3717. doi: 10.3934/mbe.2021186.

FACEMUG: A Multimodal Generative and Fusion Framework for Local Facial Editing.

IEEE Trans Vis Comput Graph. 2024 Jul 26;PP. doi: 10.1109/TVCG.2024.3434386.

Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction.

IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):4879-4893. doi: 10.1109/TPAMI.2021.3084524. Epub 2022 Aug 4.

DeepCloth: Neural Garment Representation for Shape and Style Editing.

IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):1581-1593. doi: 10.1109/TPAMI.2022.3168569. Epub 2023 Jan 6.
