Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 10607, Taiwan.
Software Technology Department, De La Salle University, Manila 1004, Philippines.
Sensors (Basel). 2019 Apr 10;19(7):1708. doi: 10.3390/s19071708.
Depth is a valuable source of information for perception tasks such as robot grasping, obstacle avoidance, and navigation, which are essential for developing smart homes and smart cities. However, not all applications have the luxury of using depth sensors or multiple cameras to obtain depth information. In this paper, we tackle the problem of estimating per-pixel depth from a single image. Inspired by recent work on generative neural network models, we formulate depth estimation as a generative task in which we synthesize a depth map from a single Red, Green, and Blue (RGB) input image. We propose a novel generative adversarial network whose encoder-decoder generator is built from residual transposed-convolution blocks and trained with an adversarial loss. Quantitative and qualitative experimental results demonstrate that our approach outperforms several existing depth estimation methods.
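To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch of an encoder-decoder generator with residual transposed-convolution decoder blocks. It is an illustrative interpretation only: the paper's actual layer counts, channel widths, normalization choices, and discriminator are not reproduced here, and every module name and hyperparameter below is an assumption rather than the authors' implementation.

```python
# Illustrative sketch only, NOT the authors' implementation.
# All names, channel widths, and hyperparameters are assumptions.
import torch
import torch.nn as nn

class ResidualTransposedBlock(nn.Module):
    """Upsamples 2x with a transposed convolution; a parallel
    upsample + 1x1 conv skip path carries the residual."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.main = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
        )
        # Skip path must match the main path's spatial and channel shape.
        self.skip = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, kernel_size=1),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.main(x) + self.skip(x))

class DepthGenerator(nn.Module):
    """Encoder-decoder generator: RGB image in, one-channel depth map out."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(  # 3 -> 256 channels, 1/8 resolution
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(  # residual upsampling back to full size
            ResidualTransposedBlock(256, 128),
            ResidualTransposedBlock(128, 64),
            ResidualTransposedBlock(64, 32),
            nn.Conv2d(32, 1, 3, padding=1),
            nn.Sigmoid(),  # normalized depth in [0, 1]
        )

    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

if __name__ == "__main__":
    g = DepthGenerator()
    rgb = torch.randn(1, 3, 64, 64)       # dummy RGB batch
    print(g(rgb).shape)                    # torch.Size([1, 1, 64, 64])
```

In adversarial training of this kind, a separate discriminator would score (RGB, depth) pairs, and the generator above would additionally be optimized to make its synthesized depth maps indistinguishable from ground-truth ones; the residual skip paths in each decoder block keep gradients flowing through the deep upsampling stack.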