
An experimental study of animating-based facial image manipulation in online class environments.

Affiliations

Graduate School of Data Science, Seoul National University of Science and Technology, Seoul, South Korea.

Department of Industrial Engineering, Seoul National University of Science and Technology, Seoul, South Korea.

Publication information

Sci Rep. 2023 Mar 22;13(1):4667. doi: 10.1038/s41598-023-31408-y.

Abstract

Recent advances in artificial intelligence have significantly improved facial image manipulation, commonly known as Deepfake. Facial image manipulation synthesizes a face, or replaces a region of the face in an image with that of another face. Techniques for facial image manipulation fall into four categories: (1) entire face synthesis, (2) identity swap, (3) attribute manipulation, and (4) expression swap. We focus on expression swap because it manipulates only the expression of the face in an image or video without creating or replacing the entire face, which is advantageous for real-time applications. In this study, we propose an evaluation framework for expression swap models targeting real-time online class environments. We define three scenarios according to the portion of the image occupied by the face, reflecting actual online class situations: (1) attendance check (Scenario 1), (2) presentation (Scenario 2), and (3) examination (Scenario 3). Modeling manipulation in online class environments, the framework receives a single source image and a target video and generates a video in which the face in the target video is manipulated to match that in the source image. To this end, we select two models that satisfy the framework's requirements: (1) the first order model and (2) GANimation. We implement both models in the framework and evaluate their performance on the defined scenarios. Through quantitative and qualitative evaluation, we observe distinguishing properties of the two models. Both show acceptable results in Scenario 1, where the face occupies a large portion of the image. However, their performance degrades significantly in Scenarios 2 and 3, where the face occupies a smaller portion of the image; in the quantitative evaluation, the first order model causes less loss of image quality than GANimation. In contrast, GANimation represents facial expression changes better than the first order model. Finally, we devise an architecture for applying an expression swap model to online video conferencing applications in real time. By applying the model to widely used meeting platforms such as Zoom, Google Meet, and Microsoft Teams, we demonstrate its feasibility for real-time online classes.
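The framework described above can be sketched as a simple frame-by-frame pipeline: a single source image and a target video go in, and a manipulated video comes out, with each output frame scored against the original for image-quality loss. The sketch below is a minimal, runnable illustration of that data flow, not the paper's implementation: `swap_expression` is a hypothetical stand-in (a plain blend) for a real model such as the first order model or GANimation, and PSNR is used as one common image-quality metric, which the abstract does not specify.

```python
import numpy as np


def swap_expression(source_img: np.ndarray, target_frame: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for an expression swap model.

    A real model would transfer the target frame's expression onto the
    face from the source image; here we blend the two frames so the
    pipeline is runnable without model weights.
    """
    return (0.5 * source_img.astype(np.float64)
            + 0.5 * target_frame.astype(np.float64)).astype(np.uint8)


def psnr(a: np.ndarray, b: np.ndarray) -> float:
    """Peak signal-to-noise ratio between two 8-bit images (higher is better)."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)


def run_pipeline(source_img: np.ndarray, target_video: list) -> list:
    """Manipulate every frame of the target video toward the source face."""
    return [swap_expression(source_img, frame) for frame in target_video]


# Synthetic stand-ins for the source image and target video frames.
rng = np.random.default_rng(0)
source = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
video = [rng.integers(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(5)]

output = run_pipeline(source, video)
scores = [psnr(out, tgt) for out, tgt in zip(output, video)]
print(len(output), min(scores))
```

In the paper's real-time architecture, the same per-frame loop would read frames from the camera, run the model, and feed the result to the conferencing application; the quality scores correspond to the quantitative comparison in which the first order model lost less image quality than GANimation.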


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dec0/10033672/07aeb01d2784/41598_2023_31408_Fig1_HTML.jpg
