Zhu Yuan, An Hao, Wang Huaide, Xu Ruidong, Sun Zhipeng, Lu Ke
School of Automotive Studies, Tongji University, Shanghai 201800, China.
Nanchang Automotive Institute of Intelligence & New Energy, Tongji University, Nanchang 330052, China.
Sensors (Basel). 2024 Jul 18;24(14):4676. doi: 10.3390/s24144676.
Most visual simultaneous localization and mapping (SLAM) systems for autonomous vehicles assume a static environment. However, when dynamic objects, particularly vehicles, occupy a large portion of the image, localization accuracy decreases significantly. To address this problem, this paper presents DOT-SLAM, a novel stereo visual SLAM system that integrates dynamic object tracking through graph optimization. By integrating dynamic object pose estimation into the SLAM system, the system can use both foreground and background points for ego vehicle localization and obtain a map of static feature points. Because the self-similar appearance of dynamic objects makes depth estimated directly from stereo disparity inaccurate on their foreground points, a coarse-to-fine depth estimation method based on camera-road plane geometry is presented. This method uses a rough depth to guide fine stereo matching, thereby obtaining the three-dimensional (3D) spatial positions of feature points on dynamic objects. Subsequently, the road plane and the vehicle's non-holonomic constraints (NHCs) are used to constrain the dynamic object's pose, which reduces the initial pose uncertainty and leads to more accurate dynamic object initialization. Finally, foreground points, background points, the local road plane, the ego vehicle pose, and dynamic object poses are treated as optimization nodes, and a nonlinear model is established and jointly optimized through graph optimization to obtain accurate six-degree-of-freedom (6-DoF) pose estimates for both the ego vehicle and dynamic objects. Experimental validation on the KITTI-360 dataset demonstrates that DOT-SLAM effectively uses features from both the background and dynamic objects, resulting in more accurate vehicle trajectory estimation and a static environment map. Results from a test on a real-world dataset further support its effectiveness.
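
The coarse-to-fine depth idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera whose optical axis is parallel to a flat road, rectified stereo images, and a plain sum-of-absolute-differences (SAD) patch cost in place of whatever matcher the paper uses. All function and parameter names (`coarse_depth_from_road`, `disparity_from_depth`, `guided_disparity`, `cam_height`, `radius`, `half`) are hypothetical.

```python
import numpy as np


def coarse_depth_from_road(v, f, cy, cam_height):
    """Rough depth (m) of a road-contact pixel at image row v.

    Assumes a flat road and zero camera pitch, so a ground point at
    depth Z projects to row v = cy + f * cam_height / Z.
    """
    dv = v - cy
    if dv <= 0:
        raise ValueError("pixel must lie below the horizon row cy")
    return f * cam_height / dv


def disparity_from_depth(z, f, baseline):
    """Disparity (px) predicted by rectified stereo geometry: d = f * B / Z."""
    return f * baseline / z


def guided_disparity(left, right, u, v, d_coarse, radius=3, half=2):
    """Fine matching: search only disparities within `radius` of d_coarse.

    Uses a SAD cost over a (2*half+1)^2 patch. Restricting the search
    window is what suppresses false matches on self-similar texture.
    """
    patch = left[v - half:v + half + 1, u - half:u + half + 1].astype(np.float64)
    d0 = int(round(d_coarse))
    best_d, best_cost = None, np.inf
    for d in range(max(0, d0 - radius), d0 + radius + 1):
        ur = u - d  # column of the candidate match in the right image
        if ur - half < 0:
            break  # candidate patch would leave the image
        cand = right[v - half:v + half + 1, ur - half:ur + half + 1].astype(np.float64)
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


if __name__ == "__main__":
    # Synthetic texture that repeats every 8 columns, mimicking a
    # self-similar vehicle surface; the true disparity is 18 px.
    rng = np.random.default_rng(0)
    left = np.tile(rng.random((200, 8)), (1, 37))
    right = np.roll(left, -18, axis=1)

    f, cy, cam_height, baseline = 700.0, 100.0, 1.5, 0.5  # illustrative values
    z_coarse = coarse_depth_from_road(150, f, cy, cam_height)
    d_coarse = disparity_from_depth(z_coarse, f, baseline)
    d_fine = guided_disparity(left, right, 150, 150, d_coarse)
    print(z_coarse, d_coarse, d_fine, f * baseline / d_fine)
```

On this synthetic texture an unrestricted search would find equally good matches every 8 pixels of disparity; the narrow window around the road-plane prediction keeps only the candidate consistent with the coarse depth.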