Optimizing Strawberry Disease and Quality Detection with Vision Transformers and Attention-Based Convolutional Neural Networks.

Author Information

Aghamohammadesmaeilketabforoosh Kimia, Nikan Soodeh, Antonini Giorgio, Pearce Joshua M

Affiliations

Department of Electrical & Computer Engineering, Western University, London, ON N6A 3K7, Canada.

Ivey Business School, Western University, London, ON N6A 3K7, Canada.

Publication Information

Foods. 2024 Jun 14;13(12):1869. doi: 10.3390/foods13121869.

Abstract

Machine learning and computer vision have proven to be valuable tools for farmers to streamline resource utilization and move toward more sustainable and efficient agricultural production. These techniques have been applied to strawberry cultivation in the past with limited success. To build on that work, two separate sets of strawberry images, along with their associated diseases, were collected in this study and subjected to resizing and augmentation. The resulting combined dataset of nine classes was used to fine-tune three pretrained models: a vision transformer (ViT), MobileNetV2, and ResNet18. To address the imbalanced class distribution, each class was assigned a weight so that all classes had a nearly equal impact during training. To further improve the results, new images were generated by removing backgrounds, reducing noise, and flipping the originals. Task-specific customizations were applied to all three models, and their performances were compared; throughout the experiment, no layers were frozen, so all layers remained trainable. Attention heads were incorporated into the first five and last five layers of MobileNetV2 and ResNet18, while the architecture of the ViT was modified. The results showed accuracies of 98.4%, 98.1%, and 97.9% for ViT, MobileNetV2, and ResNet18, respectively. Despite the imbalanced data, precision (the proportion of correctly identified positive instances among all predicted positives) approached 99% with the ViT, and MobileNetV2 and ResNet18 produced similar results. Overall, the analysis showed that the vision transformer performed best at strawberry ripeness and disease classification. Adding attention heads to the early layers of ResNet18 and MobileNetV2, along with the attention mechanism inherent to the ViT, improved classification accuracy. These findings offer farmers the potential to improve strawberry cultivation through passive camera monitoring alone, promoting the health and well-being of the population.
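The abstract outlines a concrete training recipe: inverse-frequency class weights to offset the imbalanced nine-class dataset, resizing and flipping augmentation, and full fine-tuning of pretrained backbones with no frozen layers. The sketch below is not the authors' code; it only shows how such a setup could look in PyTorch. The dataset path, image size, optimizer, batch size, and epoch count are assumptions, and ResNet18 stands in for any of the three backbones.

```python
# Minimal sketch, not the authors' code: class-weighted fine-tuning of a
# pretrained ResNet18 on a nine-class strawberry ripeness/disease dataset,
# with every layer left trainable. Paths and hyperparameters are assumptions.
from collections import Counter

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Resizing and horizontal flipping, as described in the abstract; the
# normalization constants are the standard ImageNet values (an assumption).
train_tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: one subdirectory per class (nine in total).
train_ds = datasets.ImageFolder("data/strawberry/train", transform=train_tfms)
num_classes = len(train_ds.classes)

# Inverse-frequency class weights so each class contributes roughly equally
# to the loss despite the imbalanced distribution.
counts = Counter(train_ds.targets)
class_weights = torch.tensor(
    [len(train_ds) / (num_classes * counts[c]) for c in range(num_classes)],
    dtype=torch.float,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Pretrained ResNet18 with its classifier head replaced for nine classes;
# no layers are frozen, so every parameter is updated during fine-tuning.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, num_classes)
model = model.to(device)

criterion = nn.CrossEntropyLoss(weight=class_weights.to(device))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = DataLoader(train_ds, batch_size=32, shuffle=True, num_workers=4)

model.train()
for epoch in range(10):  # epoch count is an assumption
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

The abstract also states that attention heads were added to the first five and last five layers of MobileNetV2 and ResNet18, but it does not specify the mechanism. As a purely hypothetical illustration, one common construction applies multi-head self-attention over the spatial positions of an intermediate feature map and adds the result back through a residual connection:

```python
# Hypothetical illustration only: the paper does not describe how the
# attention heads were attached. This wrapper applies multi-head
# self-attention across the spatial positions of a convolutional feature
# map and merges the output back via a residual connection.
class SpatialSelfAttention(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x):                            # x: (B, C, H, W)
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)           # (B, H*W, C)
        out, _ = self.attn(seq, seq, seq)            # attention across positions
        return x + out.transpose(1, 2).reshape(b, c, h, w)

# Example placement: wrap the first residual stage of ResNet18 (64 channels).
model.layer1 = nn.Sequential(model.layer1, SpatialSelfAttention(64)).to(device)
```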


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8e2b/11202458/cae1b50c68d9/foods-13-01869-g001.jpg
