Li Chao, Hu Yang, Liu Jianqiang, Jin Jianhai, Sun Jun
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China.
China Ship Scientific Research Center, Wuxi, China.
Front Neurorobot. 2024 Nov 20;18:1473937. doi: 10.3389/fnbot.2024.1473937. eCollection 2024.
Simultaneous Localization and Mapping (SLAM) is a core technology for intelligent systems such as robots and autonomous vehicles. Visual SLAM has become a popular variant owing to its acceptable cost and good scalability in robot positioning, navigation, and other functions. However, most visual SLAM algorithms assume a static environment, so when they are deployed in highly dynamic scenes, problems such as tracking failure and overlapped mapping are prone to occur.
To address this issue, we propose ISFM-SLAM, a dynamic visual SLAM system built upon the classic ORB-SLAM2 that incorporates an improved instance segmentation network and enhanced feature matching. Based on YOLACT, the improved instance segmentation network adopts the multi-scale residual network Res2Net as its backbone and uses CIoU_Loss as the bounding-box loss function to enhance detection accuracy. To improve the matching rate and computational efficiency of the feature points, we fuse ORB key points with an efficient image descriptor, replacing the traditional ORB feature matching of ORB-SLAM2. Moreover, a motion consistency detection algorithm based on external variance values is proposed and integrated into ISFM-SLAM to help the system cull dynamic feature points more effectively.
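The CIoU_Loss mentioned above is the standard Complete-IoU loss, which adds a center-distance penalty and an aspect-ratio consistency term to plain IoU. A minimal plain-Python rendering of that published formulation (not the authors' code) looks like:

```python
import math

def ciou_loss(box_a, box_b):
    """Complete-IoU (CIoU) loss between two axis-aligned boxes (x1, y1, x2, y2).

    CIoU = 1 - IoU + rho^2 / c^2 + alpha * v, where rho is the distance
    between box centers, c the diagonal of the smallest enclosing box,
    and v measures aspect-ratio inconsistency.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union areas
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * \
            max(0.0, min(ay2, by2) - max(ay1, by1))
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between box centers (rho^2)
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + \
           ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest enclosing box (c^2)
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + \
         (max(ay2, by2) - min(ay1, by1)) ** 2

    # Aspect-ratio consistency term v and its trade-off weight alpha
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    v = (4 / math.pi ** 2) * (math.atan(wb / hb) - math.atan(wa / ha)) ** 2
    alpha = v / (1 - iou + v) if iou < 1 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike plain IoU loss, CIoU still produces a useful gradient when boxes do not overlap, which is why it is a common drop-in for bounding-box regression.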
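The abstract does not specify the efficient image descriptor fused with the ORB key points. Purely as an illustration of the kind of matching involved, ORB-style binary descriptors are compared by Hamming distance, often with a ratio test to reject ambiguous matches; a self-contained sketch (hypothetical helper names, not the paper's method):

```python
def hamming(d1, d2):
    """Hamming distance between two binary descriptors given as byte strings."""
    return sum(bin(a ^ b).count("1") for a, b in zip(d1, d2))

def match_descriptors(query, train, ratio=0.75):
    """Brute-force matching with a ratio test: accept a match only when the
    best distance is clearly smaller than the second-best distance.
    Returns (query_index, train_index) pairs."""
    matches = []
    for qi, qd in enumerate(query):
        dists = sorted((hamming(qd, td), ti) for ti, td in enumerate(train))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches
```

Real pipelines vectorize this with bit-population-count instructions or approximate nearest-neighbor indices; the logic above is only the conceptual baseline that a faster descriptor would accelerate.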
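The variance-based motion consistency check is not detailed beyond the abstract. One hedged sketch of the general idea, assuming each feature point carries a geometric error (e.g. its distance to the epipolar line) and points whose error deviates strongly from the sample statistics are treated as dynamic:

```python
import statistics

def cull_dynamic_points(errors, k=2.0):
    """Illustrative sketch, not the paper's algorithm: keep feature points
    whose geometric error lies within k standard deviations of the mean,
    treating the rest as dynamic. Returns indices of retained (static) points.
    """
    mu = statistics.mean(errors)
    sigma = statistics.pstdev(errors)  # population standard deviation
    return [i for i, e in enumerate(errors) if abs(e - mu) <= k * sigma]
```

In a full SLAM front end, surviving points would feed pose estimation while culled points are excluded from tracking and mapping, which is what prevents moving objects from corrupting the trajectory.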
Simulation results on the TUM dataset show that the overall pose estimation accuracy of ISFM-SLAM is 97% better than that of ORB-SLAM2 and is superior to other mainstream and state-of-the-art dynamic SLAM systems. Further real-world experiments validate the feasibility of the proposed system in practical applications.