Chen Songnan, Tang Mengxia, Kan Jiangming
J Opt Soc Am A Opt Image Sci Vis. 2019 Oct 1;36(10):1709-1718. doi: 10.1364/JOSAA.36.001709.
We propose an encoder-decoder model with densely connected convolutional networks to recover depth information from a single RGB image, without the need for depth sensors. The encoder extracts the most representative information from the original data through a series of convolution operations and reduces the spatial resolution of the input features. The decoder applies an upsampling structure that restores the output resolution. Our model is trained from scratch, without any special tuning process, and uses a new optimization function that adapts the learning rate. We demonstrate the effectiveness of the method by evaluating both indoor and outdoor scenes, and the experimental results show that our proposed approach is more accurate than competing methods.
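The abstract's pipeline (dense encoder blocks that shrink spatial resolution, then an upsampling decoder that produces a single-channel depth map) can be sketched in miniature. This is a toy NumPy illustration of the data flow only, not the authors' implementation: the layer counts, growth rate, and random untrained weights are all illustrative assumptions, and a real model would use learned convolutions, batch normalization, and many more layers.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3x3(x, out_ch):
    # Toy stand-in for a learned 3x3 conv with 'same' padding and ReLU.
    # x has shape (channels, height, width); weights are random, untrained.
    C, H, W = x.shape
    w = rng.standard_normal((out_ch, C, 3, 3)) * 0.1
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((out_ch, H, W))
    for o in range(out_ch):
        for c in range(C):
            for i in range(3):
                for j in range(3):
                    out[o] += w[o, c, i, j] * xp[c, i:i+H, j:j+W]
    return np.maximum(out, 0.0)

def dense_block(x, n_layers, growth):
    # Dense connectivity: each layer sees the concatenation of all
    # earlier feature maps, so channels grow by `growth` per layer.
    feats = [x]
    for _ in range(n_layers):
        feats.append(conv3x3(np.concatenate(feats, axis=0), growth))
    return np.concatenate(feats, axis=0)

def downsample2(x):
    # 2x2 average pooling halves the spatial resolution (encoder side).
    C, H, W = x.shape
    return x.reshape(C, H // 2, 2, W // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    # Nearest-neighbor upsampling doubles resolution (decoder side).
    return np.repeat(np.repeat(x, 2, axis=1), 2, axis=2)

# Encoder: dense features, then reduce spatial resolution.
rgb = rng.standard_normal((3, 32, 32))       # single RGB input
e1 = dense_block(rgb, n_layers=2, growth=4)  # (11, 32, 32)
e2 = dense_block(downsample2(e1), 2, 4)      # (19, 16, 16)

# Decoder: upsample back and project to a one-channel depth map.
depth = conv3x3(upsample2(e2), out_ch=1)     # (1, 32, 32)
```

Even in this reduced form, the two key properties from the abstract are visible: the encoder's pooling step halves the feature resolution, and the decoder's upsampling step returns the prediction to the input resolution.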