Wang Fenghua, Jiang Jin, Chen Yu, Sun Zhexing, Tang Yuan, Lai Qinghui, Zhu Hailong
Faculty of Modern Agricultural Engineering, Kunming University of Science and Technology, Kunming, Yunnan, China.
Engineering Training Center, Kunming University of Science and Technology, Kunming, Yunnan, China.
Front Plant Sci. 2023 Jun 5;14:1200144. doi: 10.3389/fpls.2023.1200144. eCollection 2023.
Real-time fruit detection is a prerequisite for using the Xiaomila pepper harvesting robot in the harvesting process.
To reduce the model's computational cost and improve its accuracy in detecting densely distributed and occluded Xiaomila fruits, this paper adopts YOLOv7-tiny as the transfer-learning base model for field detection of Xiaomila, collects images of immature and mature Xiaomila fruits under different lighting conditions, and proposes an improved model called YOLOv7-PD. First, deformable convolution is fused into the main feature extraction network by replacing the standard convolution modules in the YOLOv7-tiny backbone and the ELAN module, which reduces the number of network parameters while improving detection accuracy for multi-scale Xiaomila targets. Second, the SE (Squeeze-and-Excitation) attention mechanism is introduced into the reconstructed feature extraction network to strengthen its ability to extract key features of Xiaomila in complex environments, enabling multi-scale Xiaomila fruit detection. The effectiveness of the proposed method is verified through ablation experiments under different lighting conditions and comparison experiments against other models.
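The SE attention mechanism referenced above follows the standard Squeeze-and-Excitation formulation (global average pooling, a channel-reduction and restoration bottleneck, then sigmoid channel reweighting). The abstract does not give the paper's exact integration into YOLOv7-tiny, so the following is only a minimal NumPy sketch of the SE operation itself; the shapes, reduction ratio, and weight initialization are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation over a (C, H, W) feature map.

    Squeeze:    global average pooling  -> per-channel descriptor (C,)
    Excitation: FC (C -> C/r) + ReLU, then FC (C/r -> C) + sigmoid
    Scale:      reweight each input channel by its learned importance
    """
    squeeze = feature_map.mean(axis=(1, 2))      # (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)       # (C/r,), ReLU
    weights = sigmoid(w2 @ hidden)               # (C,), in (0, 1)
    return feature_map * weights[:, None, None]  # channel-wise rescaling

# Toy usage: C=8 channels, reduction ratio r=4 (illustrative values only)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
w1 = rng.standard_normal((2, 8)) * 0.1  # C -> C/r
w2 = rng.standard_normal((8, 2)) * 0.1  # C/r -> C
y = se_block(x, w1, w2)
```

Because the excitation output lies in (0, 1) per channel, the block can only attenuate or preserve channels, which is how SE emphasizes informative features (e.g. fruit regions) over background clutter.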
The experimental results indicate that YOLOv7-PD achieves higher detection performance than other single-stage detection models. With these improvements, YOLOv7-PD reaches a mAP (mean Average Precision) of 90.3%, which is 2.2%, 3.6%, and 5.5% higher than that of the original YOLOv7-tiny, YOLOv5s, and MobileNetv3 models, respectively; the model size is reduced from 12.7 MB to 12.1 MB, and the computational cost is reduced from 13.1 GFLOPs to 10.3 GFLOPs.
The results show that, compared to existing models, the proposed model detects Xiaomila fruits in images more effectively while having lower computational complexity.