Zhou Qi, Wang Zhou, Zhong Yiwen, Zhong Fenglin, Wang Lijin
College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
Key Laboratory of Smart Agriculture and Forestry, Fujian Province University, Fuzhou 350002, China.
Sensors (Basel). 2024 Oct 10;24(20):6506. doi: 10.3390/s24206506.
In the field of object detection, enhancing algorithm performance in complex scenarios is a fundamental technological challenge. To address this issue, this paper presents YOLO-EV, an efficiently optimized YOLOv8 model with extended vision, which improves the performance of the YOLOv8 model through a series of innovative measures and strategies. First, we propose a multi-branch group-enhanced fusion attention (MGEFA) module and integrate it into YOLO-EV, significantly boosting the model's feature extraction capability. Second, we enhance the existing spatial pyramid pooling fast (SPPF) layer by integrating large separable kernel attention (LSKA), improving the model's efficiency in processing spatial information. Additionally, we replace the traditional IoU loss function with the Wise-IoU loss function, improving localization accuracy across target sizes. We also introduce a P6 layer to strengthen the model's detection of multi-scale targets. Through network structure optimization, we achieve higher computational efficiency, ensuring that YOLO-EV consumes fewer computational resources than YOLOv8s. In the validation section, preliminary tests on the VOC12 dataset demonstrate YOLO-EV's effectiveness on standard object detection tasks. Moreover, YOLO-EV is applied to the CottonWeedDet12 and CropWeed datasets, which feature complex scenes, diverse weed morphologies, significant occlusion, and numerous small targets. Experimental results show that YOLO-EV achieves higher detection accuracy in these complex agricultural environments than the original YOLOv8s and other state-of-the-art models, effectively identifying and locating various types of weeds and demonstrating significant practical application potential.
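Of the improvements listed, the loss-function swap is the most self-contained. As a hedged illustration of the idea (not the authors' implementation), the sketch below computes the Wise-IoU v1 formulation from Tong et al., in which the plain IoU loss is scaled by a factor based on the normalized center distance between the predicted and ground-truth boxes; the box format `(x1, y1, x2, y2)` and function names are assumptions for this example.

```python
import math

def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def wise_iou_v1(pred, target):
    """Wise-IoU v1: L = R_WIoU * (1 - IoU), where R_WIoU up-weights boxes
    whose center is far from the target, relative to the size of the
    smallest enclosing box. (In actual training the enclosing-box term is
    detached from the gradient; that detail has no effect in this
    gradient-free sketch.)"""
    # Squared distance between box centers.
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tx, ty = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    d2 = (px - tx) ** 2 + (py - ty) ** 2
    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    r_wiou = math.exp(d2 / (wg ** 2 + hg ** 2))
    return r_wiou * (1.0 - iou(pred, target))
```

A perfectly aligned prediction yields zero loss, while an offset prediction is penalized more heavily than under the plain `1 - IoU` loss, which is the mechanism credited here with improving localization across target sizes.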