IEEE Trans Pattern Anal Mach Intell. 2016 Apr;38(4):690-703. doi: 10.1109/TPAMI.2015.2439286.
In this paper, we present a technique for recovering a model of shape, illumination, reflectance, and shading from a single image captured by an RGB-D sensor. To do this, we extend the SIRFS ("shape, illumination and reflectance from shading") model, which recovers intrinsic scene properties from a single image. Though SIRFS works well on neatly segmented images of objects, it performs poorly on images of natural scenes, which often contain occlusion and spatially varying illumination. We therefore present Scene-SIRFS, a generalization of SIRFS in which we model a scene using a mixture of shapes and a mixture of illuminations, where those mixture components are embedded in a "soft" segmentation-like representation of the input image. We use the noisy depth maps provided by RGB-D sensors (such as the Microsoft Kinect) to guide and improve shape estimation. Our model takes as input a single RGB-D image and produces as output an improved depth map, a set of surface normals, a reflectance image, a shading image, and a spatially varying model of illumination. The output of our model can be used for graphics applications such as relighting and retargeting, or for broader applications involving RGB-D images, such as recognition and segmentation.
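To make the quantities named above concrete, the following is a minimal sketch of the forward model only: how a depth map, a soft segmentation, and a mixture of illuminations could combine into a shading image, and how reflectance follows from the image and the shading. It is not the inference procedure of Scene-SIRFS, and the abstract does not specify any of these details. The orthographic normal computation, the unnormalized second-order (spherical-harmonic-like) lighting basis, the log-domain decomposition, and all function names are assumptions made for illustration.

```python
import numpy as np


def normals_from_depth(depth):
    """Unit surface normals from a depth map (assumes orthographic projection)."""
    dz_dy, dz_dx = np.gradient(depth.astype(float))
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)


def quadratic_basis(normals):
    """Nine quadratic functions of the normal (unnormalized, SH-like basis)."""
    nx, ny, nz = normals[..., 0], normals[..., 1], normals[..., 2]
    return np.stack(
        [np.ones_like(nx), nx, ny, nz,
         nx * ny, nx * nz, ny * nz,
         nx ** 2 - ny ** 2, 3 * nz ** 2 - 1],
        axis=-1,
    )


def mixture_log_shading(normals, weights, lights):
    """Log-shading under a per-pixel blend of K illuminations.

    weights: (H, W, K) soft segmentation, each pixel's weights sum to 1.
    lights:  (K, 9) one coefficient vector per illumination component.
    """
    per_pixel_light = weights @ lights  # (H, W, 9) blended illumination
    return np.sum(quadratic_basis(normals) * per_pixel_light, axis=-1)


def log_reflectance(log_image, log_shading):
    """Assumed intrinsic-image relation: log I = log R + log S."""
    return log_image - log_shading


if __name__ == "__main__":
    h, w = 4, 6
    depth = np.zeros((h, w))                 # flat, fronto-parallel surface
    weights = np.zeros((h, w, 2))
    weights[:, : w // 2, 0] = 1.0            # left half uses illumination 0
    weights[:, w // 2 :, 1] = 1.0            # right half uses illumination 1
    lights = np.zeros((2, 9))
    lights[0, 0], lights[1, 0] = 0.5, -0.5   # ambient-only, bright vs. dim

    normals = normals_from_depth(depth)
    log_s = mixture_log_shading(normals, weights, lights)
    log_r = log_reflectance(np.zeros((h, w)), log_s)
    print(log_s[0], log_r[0])
```

In this toy setup the two illuminations differ only in their ambient term, so the shading image splits into a brighter and a dimmer half that follow the soft segmentation. A spatially varying illumination of the kind the abstract describes would correspond to non-binary weights that blend the components smoothly across the image.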