

On the Development of an Acoustic Image Dataset for Unexploded Ordnance Classification Using Front-Looking Sonar and Transfer Learning Methods.

Author Information

Ściegienka Piotr, Blachnik Marcin

Affiliations

Joint Doctoral School, Silesian University of Technology, 44-100 Gliwice, Poland.

SR Robotics Sp. z o.o., Lwowska 38, 40-389 Katowice, Poland.

Publication Information

Sensors (Basel). 2024 Sep 13;24(18):5946. doi: 10.3390/s24185946.

Abstract

This research aimed to develop a dataset of acoustic images recorded by a forward-looking sonar mounted on an underwater vehicle, enabling the classification of unexploded ordnance (UXO) and objects other than unexploded ordnance (nonUXO). The dataset was obtained from digital twin simulations performed in the Gazebo environment, using plugins developed within the DAVE project. It consists of 69,444 sample images of 512 × 399 resolution, organized into two classes annotated as UXO and nonUXO. The dataset was then evaluated with state-of-the-art image classification methods using off-the-shelf models and transfer learning techniques; the models studied were VGG16, ResNet34, ResNet50, ViT, RegNet, and Swin Transformer. The goal was to establish a baseline for the development of further, specialized machine learning models. The neural network experiments comprised two stages: training only the final layers of each pre-trained model, and fine-tuning the entire network. The experiments revealed that high accuracy requires fine-tuning the entire network, under which condition all the models achieved comparable performance, reaching 98% balanced accuracy. Surprisingly, the highest accuracy was obtained by the VGG model.
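For readers who want to reproduce this kind of baseline, the sketch below illustrates the two-stage transfer-learning protocol described in the abstract (training only the replaced final layer, then fine-tuning the whole network) with a single torchvision backbone. The dataset path, folder layout, preprocessing, and hyperparameters are assumptions for illustration only; the paper does not publish this code, and any of the listed backbones could be substituted for the ResNet50 used here.

```python
# Minimal sketch of the two-stage transfer-learning protocol from the abstract,
# using torchvision (>= 0.13 weights API). Paths, epochs, and learning rates
# are hypothetical placeholders, not the authors' settings.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Sonar images (assumed exported as 512x399 files in UXO/ and nonUXO/ folders)
# are resized to the ImageNet input size expected by the pre-trained backbone.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.Grayscale(num_output_channels=3),  # acoustic images are single-channel
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
train_set = datasets.ImageFolder("uxo_dataset/train", transform=preprocess)  # hypothetical path
train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Off-the-shelf ImageNet weights; the classifier head is replaced for 2 classes.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, 2)
model.to(device)
criterion = nn.CrossEntropyLoss()

def run_epochs(model, params, epochs, lr):
    """Train only the given parameters for a number of epochs."""
    optimizer = torch.optim.Adam(params, lr=lr)
    model.train()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

# Stage 1: freeze the backbone and train only the new final layer.
for p in model.parameters():
    p.requires_grad = False
for p in model.fc.parameters():
    p.requires_grad = True
run_epochs(model, model.fc.parameters(), epochs=5, lr=1e-3)

# Stage 2: unfreeze everything and fine-tune the whole network.
for p in model.parameters():
    p.requires_grad = True
run_epochs(model, model.parameters(), epochs=5, lr=1e-4)
```

Balanced accuracy, the metric quoted in the abstract, is the mean of per-class recall, which keeps the score meaningful even if the UXO and nonUXO classes are imbalanced; on a held-out test split it can be computed with sklearn.metrics.balanced_accuracy_score(y_true, y_pred).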


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c69/11436174/b537fa00eb60/sensors-24-05946-g001.jpg
