Pumarola Albert, Agudo Antonio, Martinez Aleix M, Sanfeliu Alberto, Moreno-Noguer Francesc
Institut de Robòtica i Informàtica Industrial, CSIC-UPC, 08028, Barcelona, Spain.
The Ohio State University, Columbus, OH 43210, USA.
Comput Vis ECCV. 2018 Sep;11214:835-851. doi: 10.1007/978-3-030-01249-6_50. Epub 2018 Oct 6.
Recent advances in Generative Adversarial Networks (GANs) have shown impressive results for the task of facial expression synthesis. The most successful architecture is StarGAN [4], which conditions the GAN's generation process on images of a specific domain, namely a set of images of persons sharing the same expression. While effective, this approach can only generate a discrete number of expressions, determined by the content of the dataset. To address this limitation, in this paper we introduce a novel GAN conditioning scheme based on Action Unit (AU) annotations, which describe, in a continuous manifold, the anatomical facial movements defining a human expression. Our approach allows controlling the magnitude of activation of each AU and combining several of them. Additionally, we propose a fully unsupervised strategy to train the model that requires only images annotated with their activated AUs, and we exploit attention mechanisms that make our network robust to changing backgrounds and lighting conditions. Extensive evaluations show that our approach goes beyond competing conditional generators both in its capability to synthesize a much wider range of expressions ruled by anatomically feasible muscle movements, and in its capacity to deal with images in the wild.
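A common way such an attention mechanism is realized (a minimal sketch under assumed details, since the abstract does not give the formula) is to have the generator regress both a per-pixel attention mask A and a color map C from the input face and the target AU vector, then blend: pixels where A is near 1 retain the original image (background, hair, lighting), while pixels where A is near 0 take the synthesized expression colors. The toy arrays and function name below are illustrative, not the paper's implementation:

```python
import numpy as np

def blend_with_attention(image, attention, color_map):
    """Hypothetical attention-based composition for expression synthesis.

    attention ~ 1  -> keep the original pixel (robust background/lighting)
    attention ~ 0  -> use the generator's synthesized color map
    """
    return attention * image + (1.0 - attention) * color_map

# Toy 2x2 grayscale "face" and a synthesized color map (stand-ins for
# the outputs of a trained CNN generator conditioned on an AU vector).
image = np.array([[0.2, 0.8],
                  [0.4, 0.6]])
color_map = np.full_like(image, 1.0)
attention = np.array([[1.0, 0.0],
                      [0.5, 1.0]])

output = blend_with_attention(image, attention, color_map)
# e.g. the top-left pixel (attention 1.0) is unchanged at 0.2,
# the top-right pixel (attention 0.0) is fully replaced by 1.0.
```

Because only the masked region is modified, the network need not reconstruct the background, which is one plausible reason the abstract reports robustness to changing backgrounds and lighting.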