Yang Dianyu, Cheng Chensheng, Wang Can, Pan Guang, Zhang Feihu
School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, China.
Front Neurorobot. 2022 Jul 19;16:928206. doi: 10.3389/fnbot.2022.928206. eCollection 2022.
The navigation process of an Autonomous Underwater Vehicle (AUV) relies on the interaction of a variety of sensors. Side-scan sonar can collect underwater images and, after processing, yield semantic information about the underwater environment, which helps improve the AUV's autonomous navigation capability. However, there is no practical method for exploiting the semantic information in side-scan sonar images. This paper proposes a new convolutional neural network model to address this problem. The model is a standard encoder-decoder structure that extracts multi-channel features from the input image and then fuses them to reduce the parameter count and strengthen the weighting of feature channels. A larger convolution kernel is then used to extract features from large-scale sonar images more effectively. Finally, a parallel compensation branch with a small convolution kernel is added and concatenated with the features extracted by the large kernel in the decoding stage, yielding features at different scales. We evaluate this model on self-collected sonar datasets, which have been uploaded to GitHub. The experimental results show that the accuracy (ACC) and mean intersection-over-union (MIoU) reach 0.87 and 0.71, respectively, outperforming other classical lightweight semantic segmentation networks. Furthermore, the cost of 347.52 GFLOPs and a parameter count of around 13 M ensure the network's computing speed and portability. The model can extract semantic information from side-scan sonar images and assist AUV autonomous navigation and mapping.
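The core idea of the abstract, a large-kernel branch for coarse context in large sonar images paired with a parallel small-kernel compensation branch whose outputs are concatenated, can be illustrated with a minimal NumPy sketch. All function names, kernel sizes (7×7 and 3×3), and the single-channel toy input are illustrative assumptions, not the paper's actual architecture or implementation.

```python
import numpy as np

def conv2d_same(img, kernel):
    # Naive single-channel 2-D convolution with zero padding,
    # so the output has the same spatial size as the input.
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros(img.shape, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def dual_kernel_block(img, k_large=7, k_small=3, seed=0):
    # Parallel branches: a large kernel for large-scale sonar context
    # and a small-kernel "compensation" branch for fine detail.
    # Their feature maps are concatenated along the channel axis,
    # mimicking the splice described in the abstract.
    rng = np.random.default_rng(seed)
    large = conv2d_same(img, rng.standard_normal((k_large, k_large)))
    small = conv2d_same(img, rng.standard_normal((k_small, k_small)))
    return np.stack([large, small], axis=0)  # shape: (2, H, W)

feats = dual_kernel_block(np.ones((16, 16)))
print(feats.shape)  # two feature channels, spatial size preserved
```

In a real network each branch would be a learned multi-channel convolution layer followed by nonlinearities, and the concatenated features would feed the decoder; the sketch only shows why the fused tensor carries both coarse and fine receptive fields.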