Wang Yong, Xu ShunFa, Ye ZhenYuan, Cheng KongHao
Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310000, China.
Sci Rep. 2025 Apr 7;15(1):11854. doi: 10.1038/s41598-025-96826-6.
The application of intelligent agricultural machinery is crucial in modern agricultural production. However, in environments where the target and its surroundings are morphologically very similar, such as distinguishing sesame seedlings from weeds, the problem essentially becomes one of optimizing edge detection algorithms for similar targets. To address this issue in agricultural object detection, we developed a custom dataset containing 1,300 images of sesame seedlings and weeds. To overcome the high complexity and low detection accuracy of the original DINO model on this problem, the backbone network was replaced with MobileNet V3, the SENet attention mechanism and neck structure were optimized, and the H-Swish6 activation function was introduced to suit edge devices. Given the higher degree of lignification in the stems of sesame seedlings, these modifications improved the overall Average Precision (AP) of the model on the COCO dataset by 5.1% compared to the original DINO model. Specifically, [Formula: see text] and [Formula: see text] increased by 3.3% and 3.8%, respectively, while [Formula: see text] and [Formula: see text] increased by 2.3% and 3.2%. The model's parameter count was reduced to 29M, inference time was lowered by 60%, and computational cost in FLOPs decreased by 43.72%. To verify the effectiveness of the improvements, we then evaluated the model on the custom sesame seedling and weed dataset. On this dataset, the improved DINO model achieved a maximum AP of 81.8%, outperforming the YOLOv7 model by 5.6%, at a speed of 24 frames per second. Ablation experiments verified the effectiveness of the model improvements. However, the aforementioned studies have not addressed the issue of low detection accuracy in scenarios with similar targets in the agricultural domain.
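The abstract names two building blocks associated with MobileNet V3: a hard-swish style activation and squeeze-and-excitation (SE) channel attention. The sketch below illustrates the standard published forms of both in plain Python. It is not the authors' implementation: the abstract does not define "H-Swish6", so the usual hard-swish, x * ReLU6(x + 3) / 6, is assumed here. The function names, the weight matrices `w_reduce` and `w_expand`, and the omission of bias terms are illustrative assumptions.

```python
from typing import List


def relu6(x: float) -> float:
    """Clamp x to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x: float) -> float:
    """Standard hard-swish: x * ReLU6(x + 3) / 6 (assumed form of 'H-Swish6')."""
    return x * hard_sigmoid(x)


def squeeze_excite(
    feature_map: List[List[float]],
    w_reduce: List[List[float]],
    w_expand: List[List[float]],
) -> List[List[float]]:
    """Minimal SE block over a feature map given as one flat list per channel.

    w_reduce has shape (hidden, channels); w_expand has shape (channels, hidden).
    Biases are omitted for brevity (an assumption of this sketch).
    """
    # Squeeze: global average pool each channel to a single value.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excite, step 1: bottleneck layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce
    ]
    # Excite, step 2: expand back to one gate per channel, squashed to [0, 1].
    gates = [
        hard_sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_expand
    ]
    # Scale: reweight every value in a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, feature_map)]
```

Because hard-swish uses only a clamp, an addition, and a multiplication, it avoids the exponential in the ordinary swish, which is the usual reason it is preferred on edge devices.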