Sun Qiyu, Tang Yang, Zhang Chongzhen, Zhao Chaoqiang, Qian Feng, Kurths Jurgen
IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2023-2033. doi: 10.1109/TNNLS.2021.3100895. Epub 2022 May 2.
Deep learning-based methods have achieved remarkable performance in 3-D sensing since they perceive environments in a biologically inspired manner. Nevertheless, the existing approaches trained on monocular sequences are still prone to fail in dynamic environments. In this work, we mitigate the negative influence of dynamic environments on the joint estimation of depth and visual odometry (VO) through hybrid masks. Because both the VO estimation and the view-reconstruction process in the joint estimation framework are vulnerable to dynamic environments, we propose the cover mask and the filter mask, respectively, to alleviate these adverse effects. As depth and VO estimation are tightly coupled during training, the improved VO estimation promotes depth estimation as well. In addition, a depth-pose consistency loss is proposed to overcome the scale inconsistency between different training samples of monocular sequences. Experimental results show that both our depth prediction and our globally consistent VO estimation are state of the art when evaluated on the KITTI benchmark. We also evaluate our depth prediction model on the Make3D dataset to demonstrate the transferability of our method.
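To make the two ideas in the abstract concrete, the sketch below shows (i) how a per-pixel mask can suppress dynamic-region residuals in a photometric view-reconstruction loss, and (ii) one plausible form of a depth-pose consistency term that penalizes scale drift between training samples. This is a minimal illustrative sketch, assuming PyTorch tensors and a simple L1 photometric residual; the function names, tensor shapes, and the exact consistency formulation are assumptions, not the paper's implementation.

```python
# Illustrative sketch only: masked reconstruction loss and a depth-pose
# consistency term. The exact formulations used in the paper are not given
# in the abstract; these are plausible stand-ins for exposition.
import torch


def masked_reconstruction_loss(target, warped, mask):
    """Photometric L1 loss with a per-pixel mask zeroing out dynamic regions.

    target, warped: (B, 3, H, W) images; mask: (B, 1, H, W) with values in {0, 1}.
    """
    residual = (target - warped).abs().mean(dim=1, keepdim=True)  # (B, 1, H, W)
    # Average only over the pixels the mask keeps (assumed static regions).
    return (residual * mask).sum() / mask.sum().clamp(min=1.0)


def depth_pose_consistency_loss(depth_a, trans_a, depth_b, trans_b, eps=1e-7):
    """Penalize scale inconsistency between two monocular training samples by
    comparing the ratio of mean predicted depth to translation magnitude
    (a hypothetical formulation, labeled as an assumption)."""
    scale_a = depth_a.mean() / (trans_a.norm() + eps)
    scale_b = depth_b.mean() / (trans_b.norm() + eps)
    return (scale_a - scale_b).abs() / (scale_a + scale_b + eps)
```

In this framing, the mask plays the role the abstract assigns to the filter mask in view reconstruction, while the consistency term ties the depth scale to the pose scale so that different snippets of a monocular sequence are trained against a common scale.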