Hu Junjie, Fan Chenyou, Zhou Liguang, Gao Qing, Liu Honghai, Lam Tin Lun
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):796-806. doi: 10.1109/TNNLS.2023.3323487. Epub 2025 Jan 7.
With the rapid advancements in autonomous driving and robot navigation, there is a growing demand for lifelong learning (LL) models capable of estimating metric (absolute) depth. LL approaches potentially offer significant cost savings in terms of model training, data storage, and collection. However, the quality of RGB images and depth maps is sensor-dependent, and depth maps in the real world exhibit domain-specific characteristics, leading to variations in depth ranges. These challenges limit existing methods to LL scenarios with small domain gaps and relative depth map estimation. To facilitate lifelong metric depth learning, we identify three crucial technical challenges that require attention: 1) developing a model capable of addressing the depth scale variation through scale-aware depth learning; 2) devising an effective learning strategy to handle significant domain gaps; and 3) creating an automated solution for domain-aware depth inference in practical applications. Based on the aforementioned considerations, in this article, we present 1) a lightweight multihead framework that effectively tackles the depth scale imbalance; 2) an uncertainty-aware LL solution that adeptly handles significant domain gaps; and 3) an online domain-specific predictor selection method for real-time inference. Through extensive numerical studies, we show that the proposed method can achieve good efficiency, stability, and plasticity, leading the benchmarks by 8%-15%. The code is available at https://github.com/FreeformRobotics/Lifelong-MonoDepth.
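The third contribution, online domain-specific predictor selection, can be illustrated with a small sketch. The abstract does not state the selection rule, so the code below assumes one plausible reading: each domain-specific head returns a depth map together with a per-pixel uncertainty map, and the head with the lowest mean uncertainty is chosen at inference time. The names `select_head` and `make_toy_head`, the flat-list data layout, and the toy heads are all hypothetical and are not taken from the authors' repository.

```python
from statistics import fmean
from typing import Callable, Dict, List, Tuple

# Assumption: a "head" maps shared features to (depth, uncertainty),
# both given as flat per-pixel lists of floats.
Prediction = Tuple[List[float], List[float]]
Head = Callable[[List[float]], Prediction]


def select_head(features: List[float], heads: Dict[str, Head]) -> Tuple[str, List[float]]:
    """Run every domain head and keep the one with the lowest mean uncertainty."""
    best_name, best_depth, best_score = None, None, float("inf")
    for name, head in heads.items():
        depth, uncertainty = head(features)
        score = fmean(uncertainty)  # assumed selection criterion
        if score < best_score:
            best_name, best_depth, best_score = name, depth, score
    if best_name is None:
        raise ValueError("no heads registered")
    return best_name, best_depth


def make_toy_head(scale: float, center: float) -> Head:
    """Hypothetical head: depth = scale * feature, and uncertainty
    grows as the feature moves away from `center`."""
    def head(features: List[float]) -> Prediction:
        depth = [scale * f for f in features]
        uncertainty = [abs(f - center) for f in features]
        return depth, uncertainty
    return head


# Two toy domains with different depth scales (values are illustrative only).
heads = {
    "indoor": make_toy_head(scale=10.0, center=0.2),
    "outdoor": make_toy_head(scale=80.0, center=0.8),
}
name, depth = select_head([0.7, 0.8, 0.9], heads)
print(name)
```

In this toy setup the input features sit close to the "outdoor" head's center, so that head reports the lower mean uncertainty and is selected. A real system would replace the toy heads with the trained per-domain predictors and could use a different uncertainty statistic.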