Wang Qi, Wu Meihan, Yu Fei, Feng Chen, Li Kaige, Zhu Yuemei, Rigall Eric, He Bo
School of Information Science and Engineering, Ocean University of China, Qingdao 266000, China.
Sensors (Basel). 2019 Apr 28;19(9):1985. doi: 10.3390/s19091985.
Real-time processing of high-resolution sonar images is of great significance for the autonomy and intelligence of autonomous underwater vehicles (AUVs) in complex marine environments. In this paper, we propose a real-time semantic segmentation network, termed RT-Seg, for Side-Scan Sonar (SSS) images. The proposed architecture is based on a novel encoder-decoder structure in which the encoder blocks use Depth-Wise Separable Convolution and a 2-way branch to improve performance, and a corresponding decoder network restores the details of the targets, followed by a pixel-wise classification layer. Moreover, we use a patch-wise strategy that splits the high-resolution image into local patches, which are then used for network training. The trained model is tested on high-resolution SSS images produced by a sonar sensor, running on an onboard Graphics Processing Unit (GPU). The experimental results show that RT-Seg greatly reduces the number of parameters and floating-point operations compared to other networks. It runs at 25.67 frames per second on an NVIDIA Jetson AGX Xavier with 500 × 500 inputs while producing excellent segmentation results. Further insights on the speed and accuracy trade-off are discussed in this paper.
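Two ideas in the abstract can be illustrated with simple arithmetic: why depth-wise separable convolution reduces the parameter count, and how a patch-wise strategy tiles a high-resolution image. The following is a minimal sketch, not the RT-Seg implementation. The function names, the channel counts, and the patch size and stride are hypothetical values chosen for illustration; the abstract does not state the ones the authors used.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution followed by a
    1 x 1 point-wise convolution (bias terms ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def patch_boxes(height, width, patch, stride):
    """Return (top, left, bottom, right) boxes for square patches that
    fit entirely inside the image; partial border patches are dropped.
    This border policy is an assumption, not taken from the paper."""
    return [
        (top, left, top + patch, left + patch)
        for top in range(0, height - patch + 1, stride)
        for left in range(0, width - patch + 1, stride)
    ]


if __name__ == "__main__":
    # Illustrative layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    print(conv_params(64, 128, 3), separable_conv_params(64, 128, 3))
    # Illustrative tiling of a 500 x 500 image into non-overlapping 100 x 100 patches.
    print(len(patch_boxes(500, 500, patch=100, stride=100)))
```

For the illustrative 64-to-128-channel layer, the separable form needs 8,768 weights against 73,728 for the standard convolution, roughly an 8.4-fold reduction. Setting the stride smaller than the patch size would give overlapping patches, which is another common choice when preparing training data.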