IEEE Trans Cybern. 2022 Oct;52(10):10750-10760. doi: 10.1109/TCYB.2021.3064089. Epub 2022 Sep 19.
Joint detection of drivable areas and road anomalies is crucial for mobile robots. Recently, many semantic segmentation approaches based on convolutional neural networks (CNNs) have been proposed for pixelwise drivable area and road anomaly detection. In addition, some benchmark datasets, such as KITTI and Cityscapes, have been widely used. However, the existing benchmarks are mostly designed for self-driving cars; there is no comparable benchmark for ground mobile robots, such as robotic wheelchairs. Therefore, in this article, we first build a drivable area and road anomaly detection benchmark for ground mobile robots, on which we evaluate existing state-of-the-art (SOTA) single-modal and data-fusion semantic segmentation CNNs using six modalities of visual features. Furthermore, we propose a novel module, referred to as the dynamic fusion module (DFM), which can be easily deployed in existing data-fusion networks to fuse different types of visual features effectively and efficiently. The experimental results show that the transformed disparity image is the most informative visual feature and that the proposed DFM-RTFNet outperforms the SOTA approaches. In addition, our DFM-RTFNet achieves competitive performance on the KITTI road benchmark.
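The abstract does not detail the DFM's internal design. For illustration only, the following is a minimal PyTorch sketch of one plausible dynamic-fusion block for a two-branch data-fusion network: it predicts per-pixel gating weights over the RGB and transformed-disparity feature maps and returns their weighted sum. The class name, the gating mechanism, and all shapes are assumptions, not the paper's actual DFM.

import torch
import torch.nn as nn

class DynamicFusionSketch(nn.Module):
    """Illustrative two-modality fusion block (an assumption, NOT the paper's DFM).

    Given feature maps from an RGB branch and a transformed-disparity
    branch, it predicts per-pixel, per-modality gating weights and
    returns their weighted sum, so the fusion ratio adapts to image content.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Predict two gating maps (one per modality) from the
        # concatenated features; softmax makes them sum to 1 per pixel.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1),
            nn.Softmax(dim=1),
        )

    def forward(self, rgb_feat: torch.Tensor, disp_feat: torch.Tensor) -> torch.Tensor:
        weights = self.gate(torch.cat([rgb_feat, disp_feat], dim=1))
        return weights[:, 0:1] * rgb_feat + weights[:, 1:2] * disp_feat

if __name__ == "__main__":
    fuse = DynamicFusionSketch(channels=64)
    rgb = torch.randn(1, 64, 80, 104)   # features from the RGB encoder
    disp = torch.randn(1, 64, 80, 104)  # features from the disparity encoder
    print(fuse(rgb, disp).shape)        # torch.Size([1, 64, 80, 104])

Such a drop-in block could be inserted at the encoder stages of an existing data-fusion network (e.g., in place of elementwise addition), which is consistent with the abstract's claim that the DFM is easily deployed in existing architectures.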