Learning-based, automatic 2D-to-3D image and video conversion.

Affiliation

Department of Electrical and Computer Engineering, Boston University, Boston, MA 02215, USA.

Publication information

IEEE Trans Image Process. 2013 Sep;22(9):3485-96. doi: 10.1109/TIP.2013.2270375. Epub 2013 Jun 20.

Abstract

Despite significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been the most successful but are also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality because they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first learns a point mapping from local image/video attributes at each pixel, such as color, spatial position and, in the case of video, motion, to the scene depth at that pixel, using a regression-type idea. The second globally estimates the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs), using a nearest-neighbor regression-type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of the computed depth maps.
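To make the second, repository-based method concrete, below is a minimal sketch of nearest-neighbor depth transfer. It assumes a toy global descriptor (coarse average-pooled color) and median fusion of the neighbors' depth maps; the function names, the descriptor, and the fusion rule are illustrative stand-ins, not the paper's exact features or pipeline.

```python
import numpy as np

def global_features(image, grid=(8, 8)):
    """Average-pool an RGB image onto a coarse grid and flatten it into a
    global descriptor (a stand-in for the paper's image features)."""
    h, w, _ = image.shape
    gh, gw = grid
    # Crop so the grid divides the image evenly, then pool each cell.
    image = image[: h // gh * gh, : w // gw * gw]
    cells = image.reshape(gh, h // gh, gw, w // gw, 3)
    return cells.mean(axis=(1, 3)).ravel()

def knn_depth_transfer(query, repo_images, repo_depths, k=5):
    """Estimate a depth map for `query` by finding its k nearest neighbors
    in a repository of image+depth pairs (L2 distance on the global
    descriptors) and fusing the neighbors' depth maps with a median."""
    q = global_features(query)
    dists = [np.linalg.norm(q - global_features(im)) for im in repo_images]
    nearest = np.argsort(dists)[:k]
    return np.median(np.stack([repo_depths[i] for i in nearest]), axis=0)

# Toy usage with random arrays standing in for a real RGB+D repository.
rng = np.random.default_rng(0)
repo_images = [rng.random((120, 160, 3)) for _ in range(20)]
repo_depths = [rng.random((120, 160)) for _ in range(20)]
depth = knn_depth_transfer(rng.random((120, 160, 3)), repo_images, repo_depths)
print(depth.shape)  # (120, 160)
```

A real system would replace the toy descriptor with stronger global features and refine the fused map (for example, with edge-aware smoothing guided by the query image) before rendering a stereopair; the video extension mentioned in the abstract corresponds to additionally enforcing temporal continuity across the per-frame depth maps.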

