IEEE Trans Pattern Anal Mach Intell. 2014 Nov;36(11):2144-58. doi: 10.1109/TPAMI.2014.2316835.
We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large data set containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
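The abstract's final application, converting monocular video to stereo for 3D viewing, rests on depth-image-based rendering: each pixel is shifted horizontally in proportion to its inverse depth to synthesize a left/right view pair. The sketch below is a generic illustration of that idea, not the authors' implementation; the function names and the disparity scale `max_disp` are illustrative assumptions.

```python
import numpy as np

def depth_to_disparity(depth, max_disp=24.0, eps=1e-6):
    """Map depth to horizontal disparity: nearer pixels shift more (assumed scaling)."""
    inv = 1.0 / np.maximum(depth, eps)
    inv = (inv - inv.min()) / max(inv.max() - inv.min(), eps)
    return inv * max_disp  # disparity in pixels

def render_view(image, disparity, sign=1):
    """Forward-warp pixels horizontally; fill holes from the left neighbor."""
    h, w = disparity.shape
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    xs = np.arange(w)
    for y in range(h):
        tx = np.clip(np.round(xs + sign * disparity[y]).astype(int), 0, w - 1)
        out[y, tx] = image[y, xs]
        filled[y, tx] = True
        for x in range(1, w):          # naive disocclusion filling
            if not filled[y, x]:
                out[y, x] = out[y, x - 1]
    return out

def mono_to_stereo(image, depth):
    """Split the total disparity between a synthesized left and right view."""
    disp = depth_to_disparity(depth)
    left = render_view(image, 0.5 * disp, sign=1)
    right = render_view(image, 0.5 * disp, sign=-1)
    return left, right
```

Given a frame and a depth map estimated by the paper's non-parametric sampling, `mono_to_stereo(frame, depth)` would return an anaglyph-ready view pair; real converters use more careful occlusion and hole handling than this sketch.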