Sub-R-Pa Chayanon, Chen Rung-Ching, Fan Ming-Zhong
Department of Information Management, Chaoyang University of Technology, Taichung, Taiwan.
PeerJ Comput Sci. 2024 Oct 25;10:e2438. doi: 10.7717/peerj-cs.2438. eCollection 2024.
Recent advancements in facial expression synthesis using deep learning, particularly with Cycle-Consistent Adversarial Networks (CycleGAN), have led to impressive results. However, a critical challenge persists: the generated expressions often lack the sharpness and fine details of the original face, such as freckles, moles, or birthmarks. To address this issue, we introduce the Facial Expression Morphing (FEM) algorithm, a novel post-processing method designed to enhance the visual fidelity of CycleGAN-based outputs. FEM blends the input image with the generated expression, prioritizing the preservation of crucial facial details. We evaluated our method on the Radboud Faces Database (RafD) using the Fréchet Inception Distance (FID), a standard benchmark for image-to-image translation, and introduced a new metric, Facial Similarity Distance (FSD), to specifically measure the similarity between translated and real images. Our comprehensive analysis of CycleGAN and the UNet Vision Transformer cycle-consistent GAN, versions 1 (UVCGANv1) and 2 (UVCGANv2), reveals a substantial enhancement in image clarity and preservation of intricate details. The average FID score of 31.92 achieved by our models is a 50% reduction from the previous state-of-the-art score of 63.82. This improvement in image quality is further supported by the proposed FSD metric, which shows a closer resemblance between FEM-processed images and the original faces.
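The abstract describes FEM as a post-processing blend of the input face and the CycleGAN output that keeps fine facial detail, but does not specify the blending mechanism. Below is a minimal sketch of one plausible realization, assuming a per-pixel alpha blend weighted by a high-frequency detail mask; the mask heuristic, function names, and blending strategy are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a FEM-style post-processing blend.
# NOTE: the actual FEM algorithm is not detailed in the abstract;
# the detail mask and alpha-blending strategy below are assumptions.
import cv2
import numpy as np

def detail_mask(image: np.ndarray, ksize: int = 21) -> np.ndarray:
    """Estimate where fine facial detail lives (freckles, moles, birthmarks)
    as the high-frequency residual of a Gaussian blur. Hypothetical heuristic."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).astype(np.float32)
    low = cv2.GaussianBlur(gray, (ksize, ksize), 0)
    residual = np.abs(gray - low)
    mask = residual / (residual.max() + 1e-8)         # normalize to [0, 1]
    return cv2.GaussianBlur(mask, (ksize, ksize), 0)  # soften mask edges

def fem_blend(original: np.ndarray, generated: np.ndarray) -> np.ndarray:
    """Blend the CycleGAN output with the input face, favoring the input
    pixels where the detail mask is strong."""
    alpha = detail_mask(original)[..., None]          # HxWx1, broadcasts over BGR
    blended = alpha * original.astype(np.float32) + \
              (1.0 - alpha) * generated.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)

# Usage: both images must be aligned and the same size,
# e.g. cropped RafD faces and their translated counterparts.
# result = fem_blend(cv2.imread("input.png"), cv2.imread("cyclegan_out.png"))
```

A mask-weighted blend of this kind would explain the reported trade-off: expression changes come from the generator, while sharp, identity-specific marks are copied back from the source image.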