Qing Shunhao, Qiu Zhaomei, Wang Weili, Wang Fei, Jin Xin, Ji Jiangtao, Zhao Long, Shi Yi
College of Agricultural Equipment Engineering, Henan University of Science and Technology, Luoyang, Henan, China.
Science and Technology Innovation Center for Completed Set Equipment, Longmen Laboratory, Luoyang, China.
Front Plant Sci. 2024 Jun 19;15:1411510. doi: 10.3389/fpls.2024.1411510. eCollection 2024.
The number of wheat spikes strongly influences wheat yield, so rapid and accurate spike counting is important for yield estimation and food security. Computer vision and machine learning have been widely studied as potential alternatives to manual counting. However, high-accuracy models are computationally intensive and slow, while lightweight models tend to have lower precision. To address this trade-off, YOLO-FastestV2 was selected as the base model for a comprehensive study of wheat spike detection. We constructed a wheat target detection dataset comprising 11,451 images and 496,974 bounding boxes, based on the Global Wheat Detection Dataset and the Wheat Sheaf Detection Dataset published by PaddlePaddle (PP Flying Paddle). We selected three attention mechanisms, Large Separable Kernel Attention (LSKA), Efficient Channel Attention (ECA), and Efficient Multi-Scale Attention (EMA), to enhance the feature extraction capability of the backbone network and improve the accuracy of the base model. First, each attention mechanism was added after the base stage and after the output stage of the backbone network. Second, the mechanisms that further improved accuracy at the base and output stages were selected to build a model with attention added at both stages. In addition, we constructed SimLightFPN by introducing SimConv into the LightFPN module to improve model accuracy. The results showed that the YOLO-FastestV2-SimLightFPN-ECA-EMA hybrid model, which uses the ECA attention mechanism in the base stage and combines the EMA attention mechanism with the SimLightFPN module in the output stage, had the best overall performance.
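As a rough illustration of what a channel attention step such as ECA does, the following dependency-free sketch applies global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid, and a channel-wise rescaling. It is not the authors' implementation: the function names, the nested-list feature-map layout, and the kernel values passed in are assumptions made for this example, and in a trained network the kernel weights would be learned.

```python
import math


def eca_channel_weights(feature_map, kernel):
    """Return one attention weight per channel (illustrative helper).

    feature_map: list of C channels, each an H x W nested list of floats.
    kernel: odd-length list of 1D convolution weights (assumed given here).
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")

    # Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # Zero-pad so the 1D convolution keeps the channel count unchanged.
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad

    weights = []
    for c in range(len(pooled)):
        # Local cross-channel interaction over k neighbouring channels.
        s = sum(kernel[j] * padded[c + j] for j in range(k))
        # Sigmoid squashes the response into (0, 1).
        weights.append(1.0 / (1.0 + math.exp(-s)))
    return weights


def apply_eca(feature_map, kernel):
    """Rescale every channel of feature_map by its attention weight."""
    weights = eca_channel_weights(feature_map, kernel)
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(feature_map, weights)
    ]


# Toy input: 3 channels of size 1 x 1, with an identity-like kernel.
toy_map = [[[1.0]], [[2.0]], [[3.0]]]
toy_weights = eca_channel_weights(toy_map, [0.0, 1.0, 0.0])
toy_out = apply_eca(toy_map, [0.0, 1.0, 0.0])
```

The output has the same shape as the input, which is why a module of this kind can be placed after a backbone stage without changing the rest of the network.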
The model achieved precision P = 83.91%, recall R = 78.35%, AP = 81.52%, and F1 = 81.03%, and it ranked first on GPI (0.84) in the overall evaluation. The study also examines the deployment of wheat spike detection and counting models on resource-constrained devices, offering new solutions for agricultural automation and precision agriculture.
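The reported F1 value can be cross-checked against the reported precision and recall, assuming the standard definition of F1 as the harmonic mean of the two. The helper name `f1_score` below is introduced only for this check.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
f1 = f1_score(83.91, 78.35)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 81.03%"
```

The computed value agrees with the reported F1 of 81.03% to two decimal places.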