Yang Jikai, Li Zihan, Gu Ziyan, Li Wei
School of Naval Architecture and Ocean Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China.
Sci Rep. 2024 Dec 30;14(1):32086. doi: 10.1038/s41598-024-83543-9.
With the advancement of artificial intelligence technology, unmanned boats that use deep learning models have shown significant potential for water surface garbage classification. This study employs a convolutional neural network (CNN) to extract features of floating objects on the water surface and constructs the VGG16-15 model, based on the VGG-16 architecture, which can identify 15 common types of water surface floatables. A garbage classification dataset of 5707 images in 15 categories was curated and split into training and validation sets in a 4:1 ratio. The base VGG-16 model was customized by adjusting the network structure to suit the 15 floating object categories, applying learning rate decay and early stopping for model optimization, and using data augmentation to improve generalization. The study then varied the number of epochs and the batch size to analyze their impact on classification performance. The results show that the model performs best with 20 epochs and a batch size of 64, reaching a recognition accuracy of 93.86%. This is a 10.09% improvement over the traditional VGG-16 model and a 4.91% increase compared to the model without data augmentation, demonstrating the effectiveness of the model improvements and data augmentation in enhancing image recognition. Additionally, a few-shot test demonstrates the fine-tuned model's improved generalization capability. This research illustrates the applicability of transfer learning to water surface garbage classification and provides technical support for the use of unmanned boats in environmental protection.
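The training setup described in the abstract (a 4:1 split, learning rate decay, and early stopping) can be illustrated with a minimal, framework-free sketch. Everything below is an assumption made for illustration: the abstract does not state the shuffling procedure, the decay schedule, or the early stopping patience, so the helper names `split_4_to_1`, `EarlyStopping`, and `decayed_lr` and all of their default values are hypothetical and are not the authors' implementation.

```python
import random


def split_4_to_1(items, seed=0):
    """Shuffle items and split them into training and validation parts at 4:1.

    The shuffling and the seed are assumptions; the abstract gives only the ratio.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = len(shuffled) // 5  # one fifth goes to validation
    return shuffled[n_val:], shuffled[:n_val]


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs.

    The patience value is hypothetical; the abstract does not report one.
    """

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def decayed_lr(initial_lr, epoch, decay_rate=0.5, step=5):
    """Step decay: multiply the rate by `decay_rate` every `step` epochs.

    Step decay is one common schedule, chosen here only as an example.
    """
    return initial_lr * decay_rate ** (epoch // step)


# Image indices stand in for the 5707 images mentioned in the abstract.
train_ids, val_ids = split_4_to_1(range(5707))
```

With integer division, the 5707 images divide into 4566 training and 1141 validation items, which is as close to 4:1 as the count allows. In a real training loop, `decayed_lr` would be called once per epoch and `EarlyStopping.should_stop` after each validation pass.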