Sengupta Soumyadip, Lichy Daniel, Kanazawa Angjoo, Castillo Carlos D, Jacobs David W
IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3272-3284. doi: 10.1109/TPAMI.2020.3046915. Epub 2022 May 5.
We present SfSNet, an end-to-end learning framework for producing an accurate decomposition of an unconstrained human face image into shape, reflectance, and illuminance. SfSNet is designed to reflect a physical Lambertian rendering model. SfSNet learns from a mixture of labeled synthetic and unlabeled real-world images. This allows the network to capture low-frequency variations from synthetic images and high-frequency details from real images through the photometric reconstruction loss. SfSNet consists of a new decomposition architecture with residual blocks that learns a complete separation of albedo and normal. This is used along with the original image to predict lighting. SfSNet produces significantly better quantitative and qualitative results than state-of-the-art methods for inverse rendering and for independent normal and illumination estimation. We also introduce a companion network, SfSMesh, that uses the normals estimated by SfSNet to reconstruct a 3D face mesh. We demonstrate that SfSMesh produces face meshes with greater accuracy than state-of-the-art methods on real-world images.
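The Lambertian rendering model and the photometric reconstruction loss mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that lighting is represented by nine second-order spherical harmonic coefficients per color channel (with the usual normalization constants folded into the coefficients) and that the loss is a mean absolute error; the abstract specifies neither. The function names `sh_basis`, `render_lambertian`, and `photometric_loss` are hypothetical.

```python
import numpy as np


def sh_basis(normals):
    """Unnormalized second-order spherical harmonic basis (9 terms).

    normals: array of shape (H, W, 3) holding unit surface normals.
    Returns an array of shape (H, W, 9).
    Assumption: normalization constants are absorbed into the lighting
    coefficients, so only the polynomial terms appear here.
    """
    x, y, z = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack(
        [
            np.ones_like(x),   # constant (ambient) term
            y, z, x,           # first-order terms
            x * y, y * z,      # second-order terms
            3.0 * z**2 - 1.0,
            x * z,
            x**2 - y**2,
        ],
        axis=-1,
    )


def render_lambertian(albedo, normals, light):
    """Render image = albedo * shading(normals, light).

    albedo:  (H, W, 3) reflectance
    normals: (H, W, 3) unit normals
    light:   (9, 3) spherical harmonic coefficients per color channel
    """
    shading = sh_basis(normals) @ light  # (H, W, 9) @ (9, 3) -> (H, W, 3)
    return albedo * shading


def photometric_loss(image, albedo, normals, light):
    """Mean absolute error between the input image and its re-rendering.

    Assumption: an L1 penalty; the abstract does not name the norm.
    """
    return float(np.mean(np.abs(image - render_lambertian(albedo, normals, light))))
```

Because this loss compares the re-rendered image with the input image itself, it needs no ground-truth albedo, normals, or lighting, which is what makes it usable on the unlabeled real-world images described above.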