Division of Mechanical and Automotive Engineering, Kongju National University, Cheonan, South Korea.
School of Mechanical Engineering, Korea University of Technology and Education, Cheonan, South Korea.
PLoS One. 2022 Aug 12;17(8):e0272602. doi: 10.1371/journal.pone.0272602. eCollection 2022.
In underwater environments, the study of object recognition is an important basis for implementing an unmanned underwater vessel. For this purpose, abundant experimental data are required to train a deep learning model. However, such data are very difficult to obtain because underwater experiments are severely limited in preparation time and resources. In this study, the image transformation model Pix2Pix is used to generate data similar to the experimental data obtained by our ROV, named SPARUS, between the pool and the reservoir. The generated data are then used to train another deep learning model, FCN, for pixel-wise segmentation of images. Training the image segmentation model requires every original sonar image to be paired with its mask image, and preparing these pairs takes a lot of effort if all training data must be real sonar images. Fortunately, this burden is relieved here, because the pairs of mask image and synthesized sonar image are already produced in the image transformation step. The validity of the proposed procedure is verified by the performance of the image segmentation results. When only real sonar images are used for training, the mean accuracy is 0.7525 and the mean IoU is 0.7275. When both synthetic and real data are used for training, the mean accuracy is 0.81 and the mean IoU is 0.7225. Comparing the results, the mean accuracy increases by about 6 percentage points (from 0.7525 to 0.81), while the mean IoU remains at a similar value.
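The two reported metrics can be illustrated with a minimal sketch. It assumes the per-class definitions commonly used for semantic segmentation (mean accuracy as the average per-class pixel recall, mean IoU as the average per-class intersection over union); the abstract does not state the exact formulas used, and the function names and the toy masks below are hypothetical, not the study's code or data.

```python
def confusion_matrix(truth, pred, num_classes):
    """cm[i][j] = number of pixels of true class i predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def mean_accuracy(cm):
    """Average over classes of (correct pixels of class i) / (true pixels of class i)."""
    per_class = []
    for i, row in enumerate(cm):
        total = sum(row)
        if total:  # skip classes absent from the ground truth
            per_class.append(row[i] / total)
    return sum(per_class) / len(per_class)


def mean_iou(cm):
    """Average over classes of intersection / union of true and predicted pixels."""
    n = len(cm)
    per_class = []
    for i in range(n):
        intersection = cm[i][i]
        truth_total = sum(cm[i])
        pred_total = sum(cm[j][i] for j in range(n))
        union = truth_total + pred_total - intersection
        if union:  # skip classes absent from both masks
            per_class.append(intersection / union)
    return sum(per_class) / len(per_class)


# Toy flattened masks (assumption): 0 = background, 1 = object.
truth = [0, 0, 1, 1, 0, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0, 1, 0]
cm = confusion_matrix(truth, pred, num_classes=2)
toy_acc = mean_accuracy(cm)
toy_iou = mean_iou(cm)

# Differences between the values reported in the abstract
# (real + synthetic training minus real-only training).
acc_gain = round(0.81 - 0.7525, 4)
iou_change = round(0.7225 - 0.7275, 4)
print(toy_acc, toy_iou, acc_gain, iou_change)
```

With the reported values, the difference in mean accuracy is 0.0575 (roughly 6 percentage points) and the difference in mean IoU is -0.005, which matches the abstract's statement that the IoU stays at a similar value.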