Zhang Qi, Yu Wentao, Liu Weirong, Xu Hao, He Yuan
School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410018, China.
School of Computer, Central South University, Changsha 410083, China.
Sensors (Basel). 2023 Nov 19;23(22):9274. doi: 10.3390/s23229274.
Most traditional visual SLAM (VSLAM) systems rely on a static-scene assumption, which leads to low accuracy in dynamic environments; systems that do achieve higher accuracy in such environments typically do so at the cost of real-time performance. In highly dynamic scenes, balancing high accuracy against low computational cost has become a pivotal requirement for VSLAM systems. This paper proposes a new VSLAM system that balances the competing demands of positioning accuracy and computational complexity, thereby improving overall system performance. For accuracy, the system applies an improved lightweight object detection network at the front end to quickly detect dynamic feature points while feature points are being extracted, and only feature points belonging to static objects are used for frame matching. An attention mechanism is also integrated into the detection network so that dynamic factors are captured continuously and accurately, allowing the system to cope with more complex dynamic environments. For computational cost, the lightweight GhostNet module is used as the backbone of the YOLOv5s detection network, significantly reducing the number of model parameters and improving the overall inference speed. Experimental results on the TUM dynamic dataset indicate that, compared with ORB-SLAM3, the pose estimation accuracy of the system improved by 84.04%. Compared with dynamic SLAM systems such as DS-SLAM and DVO SLAM, the system achieves significantly higher positioning accuracy. Compared with other deep-learning-based VSLAM algorithms, the system offers superior real-time performance while maintaining a similar level of accuracy.
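The front-end idea described in the abstract, keeping only feature points that lie on static objects, can be illustrated with a minimal sketch. It assumes that the detector returns axis-aligned bounding boxes for dynamic objects and that a feature point counts as dynamic when it falls inside any such box. The abstract does not specify the actual rejection rule, and the names `in_box` and `keep_static_keypoints` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]              # (x, y) pixel coordinates of a feature point
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) of a detection


def in_box(pt: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (borders included)."""
    x, y = pt
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def keep_static_keypoints(keypoints: List[Point], dynamic_boxes: List[Box]) -> List[Point]:
    """Drop every keypoint that falls inside at least one dynamic-object box.

    Assumption: box membership is the only criterion; the paper's real
    filtering rule may differ.
    """
    return [kp for kp in keypoints
            if not any(in_box(kp, box) for box in dynamic_boxes)]


# Illustrative values only, not data from the paper.
keypoints = [(10.0, 10.0), (50.0, 60.0), (200.0, 120.0)]
dynamic_boxes = [(40.0, 40.0, 100.0, 100.0)]  # e.g. a detected person
static_kps = keep_static_keypoints(keypoints, dynamic_boxes)
print(static_kps)
```

Only the points returned by such a filter would then be passed on to frame matching and pose estimation.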