Guo Canzhi, Zheng Shiwu, Cheng Guanggui, Zhang Yue, Ding Jianning
Institute of Intelligent Flexible Mechatronics, Jiangsu University, Zhenjiang, China.
Jiangsu Collaborative Innovation Center of Photovoltaic Science and Engineering, Changzhou, China.
Front Plant Sci. 2023 Jul 13;14:1209910. doi: 10.3389/fpls.2023.1209910. eCollection 2023.
Visual recognition is the most critical function of a harvesting robot, and the accuracy of the harvesting action depends on the performance of visual recognition. However, unstructured environments, involving severe occlusion, overlapping fruits, illumination changes, complex backgrounds, and even heavy fog, pose a series of serious challenges to the detection accuracy of the recognition algorithm. Hence, this paper proposes an improved YOLO v4 model, called YOLO v4+, to cope with the challenges posed by unstructured environments. The output of each Resblock_body in the backbone is processed by a simple, parameter-free attention mechanism for full-dimensional refinement of the extracted features. Further, to alleviate the problem of feature-information loss, a multi-scale feature fusion module with fusion weights and a skip-connection structure is proposed. In addition, the focal loss function is adopted, with the hyperparameters α and γ set to 0.75 and 2, respectively. The experimental results show that the average precision of the YOLO v4+ model is 94.25% and its F1 score is 93%, which are 3.35% and 3% higher than those of the original YOLO v4, respectively. Compared with several state-of-the-art detection models, YOLO v4+ not only achieves the strongest overall performance but also generalizes better. Selecting an augmentation method suited to the specific working condition can greatly improve the model's detection accuracy. Applying the proposed method to harvesting robots may enhance the applicability and robustness of the robotic system.
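The focal loss mentioned in the abstract down-weights easy, well-classified examples so that training focuses on hard ones. As a minimal sketch (not the authors' implementation), the binary form with the reported hyperparameters α = 0.75 and γ = 2 can be written as:

```python
import math

def binary_focal_loss(p, y, alpha=0.75, gamma=2.0):
    """Focal loss for a single binary prediction.

    p:     predicted probability of the positive class, in (0, 1)
    y:     ground-truth label, 1 (positive) or 0 (negative)
    alpha: class-balancing weight (0.75 in the abstract)
    gamma: focusing parameter (2 in the abstract)
    """
    # p_t is the model's probability for the true class
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)**gamma shrinks the loss on confident, correct predictions
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With γ = 2, a confident correct prediction (p_t = 0.9) contributes roughly three orders of magnitude less loss than a badly misclassified one (p_t = 0.1), which is the mechanism the authors rely on to cope with hard cases such as occluded or fog-obscured fruit.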