Aleotti Filippo, Zaccaroni Giulio, Bartolomei Luca, Poggi Matteo, Tosi Fabio, Mattoccia Stefano
Department of Computer Science and Engineering, University of Bologna, 40136 Bologna, Italy.
Sensors (Basel). 2020 Dec 22;21(1):15. doi: 10.3390/s21010015.
Depth perception is paramount for tackling real-world problems, ranging from autonomous driving to consumer applications. For the latter, depth estimation from a single image would represent the most versatile solution, since a standard camera is available on almost any handheld device. Nonetheless, two main issues limit the practical deployment of monocular depth estimation methods on such devices: (i) low reliability when deployed in the wild and (ii) the resources needed to achieve real-time performance, which are often incompatible with low-power embedded systems. In this paper, we investigate both issues in depth, showing that each can be addressed by adopting appropriate network design and training strategies. We also outline how to map the resulting networks onto handheld devices to achieve real-time performance. Our thorough evaluation highlights the ability of such fast networks to generalize well to new environments, a crucial feature for tackling the extremely varied contexts faced in real applications. To further support this evidence, we report experimental results on real-time, depth-aware augmented reality and image blurring with smartphones in the wild.