Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA.
IEEE Trans Image Process. 1997;6(4):584-98. doi: 10.1109/83.563323.
This paper focuses on the representation and view generation of three-dimensional (3-D) scenes. In contrast to existing methods that construct a full 3-D model or those that exploit geometric invariants, our representation consists of dense depth maps at several preselected viewpoints from an image sequence. Furthermore, instead of using multiple calibrated stationary cameras or range scanners, we derive our depth maps from image sequences captured by an uncalibrated camera with only approximately known motion. We propose an adaptive matching algorithm that assigns various confidence levels to different regions in the depth maps. Nonuniform bicubic spline interpolation is then used to fill in low confidence regions in the depth maps. Once the depth maps are computed at preselected viewpoints, the intensity and depth at these locations are used to reconstruct arbitrary views of the 3-D scene. Specifically, the depth maps are regarded as vertices of a deformable 2-D mesh, which are transformed in 3-D, projected to 2-D, and rendered to generate the desired view. Experimental results are presented to verify our approach.
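The core view-generation step described in the abstract, treating each depth-map pixel as a 3-D point, transforming it rigidly, and reprojecting it into the target camera, can be sketched as follows. This is a minimal NumPy illustration under assumed intrinsics `K` and relative pose `(R, t)`; the function name and all values are illustrative, and the paper itself goes further by rendering the depth map as a deformable 2-D mesh rather than warping pixels independently.

```python
import numpy as np

def warp_depth_map(depth, K, R, t):
    """Back-project each pixel of `depth` to 3-D, apply the rigid
    transform (R, t), and reproject with intrinsics K.

    Returns the new pixel coordinates (2, H, W) and the transformed
    depth (H, W) in the target camera frame.
    """
    h, w = depth.shape
    # Pixel grid in homogeneous coordinates: columns are [u, v, 1]^T.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])
    # Back-project: X = depth * K^{-1} [u, v, 1]^T.
    rays = np.linalg.inv(K) @ pix
    points = rays * depth.ravel()
    # Rigid transform into the target camera frame, then project.
    moved = R @ points + t[:, None]
    proj = K @ moved
    uv_new = proj[:2] / proj[2]
    return uv_new.reshape(2, h, w), moved[2].reshape(h, w)

# Toy example: flat depth plane, pure sideways camera translation.
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 2.0],
              [0.0,   0.0, 1.0]])
depth = np.full((4, 4), 5.0)
R = np.eye(3)
t = np.array([0.1, 0.0, 0.0])
uv, z = warp_depth_map(depth, K, R, t)
```

With a fronto-parallel plane and a pure x-translation, depth is unchanged and every pixel shifts horizontally by the same parallax, which is a quick sanity check on the projection geometry.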