Chen Yajing, Wu Fanzi, Wang Zeyu, Song Yibing, Ling Yonggen, Bao Linchao
IEEE Trans Image Process. 2020 Aug 27;PP. doi: 10.1109/TIP.2020.3017347.
In this paper, we present an end-to-end learning framework for detailed 3D face reconstruction from a single image. Our approach uses a 3DMM-based coarse model and a displacement map in UV-space to represent a 3D face. Unlike previous work addressing this problem, our learning framework does not require supervision from surrogate ground-truth 3D models computed with traditional approaches. Instead, we utilize the input image itself as supervision during learning. In the first stage, we combine a photometric loss and a facial perceptual loss between the input face and the rendered face to regress a 3DMM-based coarse model. In the second stage, both the input image and the regressed texture of the coarse model are unwrapped into UV-space and then sent through an image-to-image translation network to predict a displacement map in UV-space. The displacement map and the coarse model are used to render a final detailed face, which again can be compared with the original input image to serve as a photometric loss for the second stage. The advantage of learning the displacement map in UV-space is that face alignment can be done explicitly during unwrapping, so facial details are easier to learn from large amounts of data. Extensive experiments demonstrate the superiority of the proposed method over previous work.
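The two-stage self-supervised loss structure described in the abstract can be sketched as follows. This is a minimal illustration only: all function names, weights, and shapes are assumptions, not the authors' implementation, and the differentiable renderer, 3DMM regressor, and image-to-image translation network are abstracted away as precomputed inputs.

```python
import numpy as np

def photometric_loss(input_img, rendered_img, mask):
    """Mean absolute color difference over the visible face region."""
    diff = np.abs(input_img - rendered_img) * mask[..., None]
    return diff.sum() / (mask.sum() * input_img.shape[-1] + 1e-8)

def perceptual_loss(feat_input, feat_rendered):
    """Cosine-style distance between face feature embeddings
    (e.g. from a pretrained face-recognition network)."""
    a = feat_input / (np.linalg.norm(feat_input) + 1e-8)
    b = feat_rendered / (np.linalg.norm(feat_rendered) + 1e-8)
    return 1.0 - float(a @ b)

def stage1_loss(input_img, rendered_coarse, mask,
                feat_in, feat_ren, w_photo=1.0, w_percep=0.2):
    # Stage 1: regress the 3DMM-based coarse model, supervised only by
    # the input image via photometric + facial perceptual terms.
    # The weights w_photo / w_percep are illustrative.
    return (w_photo * photometric_loss(input_img, rendered_coarse, mask)
            + w_percep * perceptual_loss(feat_in, feat_ren))

def stage2_loss(input_img, rendered_detailed, mask):
    # Stage 2: the UV-space displacement map predicted by the
    # image-to-image translation network is applied to the coarse
    # model; the rendered detailed face is again compared
    # photometrically with the same input image.
    return photometric_loss(input_img, rendered_detailed, mask)

# Toy example with random data standing in for real renders/features.
rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))
coarse_render = rng.random((64, 64, 3))
face_mask = np.ones((64, 64))
f_in, f_ren = rng.random(128), rng.random(128)
print(stage1_loss(img, coarse_render, face_mask, f_in, f_ren))
```

The key point the sketch mirrors is that both stages reuse the input image itself as the supervision signal, so no surrogate ground-truth 3D models are ever needed.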