School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China.
Sensors (Basel). 2022 Apr 23;22(9):3243. doi: 10.3390/s22093243.
Ship recognition is a fundamental step in maritime activities, with applications in maritime rescue, vessel management, and related areas. However, most studies in this area use synthetic aperture radar (SAR) images or space-borne optical images, and those that use visible images are limited to coarse-grained classification. In this study, we constructed a fine-grained ship dataset of real and simulated images covering five categories of ships. To address the low accuracy of fine-grained ship classification across different viewing angles in visible images, we proposed a network based on domain adaptation and a transformer. Concretely, style transfer was first used to reduce the gap between the simulated and real images. Then, with the goal of using the simulated images to perform classification on the real images, a domain adaptation network based on local maximum mean discrepancy (LMMD) was used to align the distributions of the two domains. Furthermore, given the transformer's built-in attention mechanism, a vision transformer (ViT) was chosen as the feature extraction module to extract fine-grained features, and a fully connected layer was used as the classifier. Experimental results showed that the network achieved an overall accuracy of 96.0% on the fine-grained ship dataset, and that a detect-then-classify pipeline using the network reached a mean average precision (mAP) of 87.5%, which also supports the feasibility of using computer-simulated images for auxiliary training.
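The LMMD alignment step mentioned in the abstract can be illustrated with a small sketch. LMMD compares source and target features class by class: each sample is weighted by its class membership (one-hot labels for the labelled simulated images, predicted probabilities for the unlabelled real images), and the weighted kernel mean embeddings of the two domains are compared per class. The code below is only a minimal illustration under stated assumptions, not the authors' implementation: it uses a single Gaussian kernel with a fixed bandwidth `sigma`, plain Python lists instead of ViT features, and skips classes that are absent from either domain. The names `gaussian_kernel`, `class_weights`, and `lmmd` are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def class_weights(memberships, num_classes):
    """Normalise class memberships so that each class column sums to 1.

    Returns the weights and the raw per-class totals (used to detect
    classes that are absent from a domain).
    """
    totals = [sum(m[c] for m in memberships) for c in range(num_classes)]
    weights = [
        [m[c] / totals[c] if totals[c] > 0 else 0.0 for c in range(num_classes)]
        for m in memberships
    ]
    return weights, totals


def weighted_kernel_sum(xs, wx, ys, wy, c, sigma):
    """Sum over all pairs of w_x[i][c] * w_y[j][c] * k(x_i, y_j)."""
    return sum(
        wx[i][c] * wy[j][c] * gaussian_kernel(xs[i], ys[j], sigma)
        for i in range(len(xs))
        for j in range(len(ys))
    )


def lmmd(src_feats, src_onehot, tgt_feats, tgt_probs, num_classes, sigma=1.0):
    """Class-wise weighted MMD between source and target features.

    src_onehot: one-hot labels of the source (simulated) samples.
    tgt_probs:  predicted class probabilities of the target (real) samples.
    Classes missing from either domain are skipped (a simplifying assumption).
    """
    ws, src_totals = class_weights(src_onehot, num_classes)
    wt, tgt_totals = class_weights(tgt_probs, num_classes)
    total = 0.0
    for c in range(num_classes):
        if src_totals[c] == 0 or tgt_totals[c] == 0:
            continue
        ss = weighted_kernel_sum(src_feats, ws, src_feats, ws, c, sigma)
        tt = weighted_kernel_sum(tgt_feats, wt, tgt_feats, wt, c, sigma)
        st = weighted_kernel_sum(src_feats, ws, tgt_feats, wt, c, sigma)
        total += ss + tt - 2.0 * st
    return total / num_classes


if __name__ == "__main__":
    # Toy 2-D features standing in for extracted features; two classes.
    src = [[0.0, 0.0], [1.0, 1.0]]
    src_labels = [[1.0, 0.0], [0.0, 1.0]]
    tgt = [[3.0, 3.0], [4.0, 4.0]]
    tgt_probs = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned domains:", lmmd(src, src_labels, src, src_labels, 2))
    print("shifted domains:", lmmd(src, src_labels, tgt, tgt_probs, 2))
```

In a training loop, a term of this kind would typically be added to the classification loss on the labelled source images, so that minimising it pulls same-class features from the two domains together. How the paper weights the two terms is not stated in the abstract.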