Cong Runmin, Wu Chunlei, Song Xibin, Zhang Wei, Kwong Sam, Li Hongdong, Ji Pan
IEEE Trans Image Process. 2024;33:5538-5550. doi: 10.1109/TIP.2024.3465034. Epub 2024 Oct 4.
Deep CNNs have achieved impressive improvements in night-time self-supervised depth estimation from a monocular image. However, performance degrades considerably compared to day-time depth estimation due to the significant domain gap, low visibility, and varying illumination between day and night images. To address these challenges, we propose a novel night-time self-supervised monocular depth estimation framework with structure regularization, i.e., SRNSD, which incorporates three constraints for better performance: feature and depth domain adaptation, an image perspective constraint, and a cropped multi-scale consistency loss. Specifically, we adapt both the feature and depth output spaces for better night-time feature extraction and depth map prediction, along with high- and low-frequency decoupling operations for better recovery of depth structure and texture. Meanwhile, we employ an image perspective constraint to enhance smoothness and obtain better depth maps in areas with abrupt luminosity changes. Furthermore, we introduce a simple yet effective cropped multi-scale consistency loss that exploits consistency among depth outputs at different scales for further optimization, refining the detailed textures and structures of the predicted depth. Experimental results on benchmarks with depth ranges of 40 m and 60 m, including the Oxford RobotCar, nuScenes, and CARLA-EPE datasets, demonstrate the superiority of our approach over state-of-the-art night-time self-supervised depth estimation methods across multiple metrics.
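The abstract does not give implementation details, so the following is a minimal PyTorch-style sketch of how a "cropped multi-scale consistency loss" could be realized: coarser-scale depth predictions are upsampled to the finest resolution, a shared central crop is taken at every scale, and each crop is penalized for deviating from the finest-scale prediction. The function names, crop ratio, and L1 distance are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn.functional as F


def center_crop(depth: torch.Tensor, ratio: float = 0.8) -> torch.Tensor:
    """Keep the central `ratio` fraction of a (B, 1, H, W) depth map."""
    _, _, h, w = depth.shape
    ch, cw = int(h * ratio), int(w * ratio)
    top, left = (h - ch) // 2, (w - cw) // 2
    return depth[:, :, top:top + ch, left:left + cw]


def cropped_multiscale_consistency(depths: list[torch.Tensor]) -> torch.Tensor:
    """`depths` holds per-scale predictions, finest first (e.g. 1, 1/2, 1/4, 1/8 resolution)."""
    target = center_crop(depths[0]).detach()      # finest scale serves as the reference
    full_size = depths[0].shape[-2:]
    loss = depths[0].new_zeros(())
    for d in depths[1:]:
        # Upsample the coarse prediction to the finest resolution, then compare crops.
        d_up = F.interpolate(d, size=full_size, mode="bilinear", align_corners=False)
        loss = loss + F.l1_loss(center_crop(d_up), target)
    return loss / max(len(depths) - 1, 1)
```

Under these assumptions, the loss would be added to the usual photometric and smoothness terms during training, e.g. `loss = photo_loss + smooth_loss + w * cropped_multiscale_consistency(pred_depths)` with a small weight `w`.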