Institute for Future Farming Systems, Central Queensland University, Rockhampton 4701, Australia.
Sensors (Basel). 2024 Aug 29;24(17):5593. doi: 10.3390/s24175593.
Recent developments in affordable depth imaging hardware and the use of 2D Convolutional Neural Networks (CNNs) in object detection and segmentation have accelerated the adoption of machine vision in a range of applications, with mainstream models often outperforming previous application-specific architectures. Any work reporting model development should release its training and test datasets so that published results can be re-evaluated. A further reporting need is to document performance across repeated trainings of a given model, quantifying the impact of stochastic processes in training. Three mango orchard applications were considered: (i) fruit count, (ii) fruit size and (iii) branch avoidance in automated harvesting. All training and test datasets used in this work are publicly available. The mAP 'coefficient of variation' (the Standard Deviation, SD, of predictions from models of repeated trainings, divided by their mean, × 100) was approximately 0.2% for the fruit detection model and 1% and 2% for the fruit and branch segmentation models, respectively. A YOLOv8m model achieved a mAP50 of 99.3%, outperforming the previous benchmark, the purpose-designed 'MangoYOLO', for real-time detection of mango fruit in images of tree canopies, with use on an edge computing device a viable use case. YOLOv8 and v9 models outperformed the benchmark Mask R-CNN model in both accuracy and inference time, achieving up to 98.8% mAP50 on fruit predictions and 66.2% on branches in a leafy canopy. For fruit sizing, the accuracy of YOLOv8m-seg was similar to that achieved using Mask R-CNN, but the inference time was much shorter, again an enabler for field adoption of this technology. A branch avoidance algorithm was proposed; its real-time implementation on an edge computing device was enabled by the short inference time of a YOLOv8-seg model for branches and fruit.
This capability contributes to the development of automated fruit harvesting.
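The 'coefficient of variation' defined in the abstract (SD divided by mean, × 100) can be illustrated with a minimal Python sketch. The function name and the mAP50 values are hypothetical and are not taken from the paper. The use of the sample standard deviation (n − 1 denominator) is also an assumption, since the abstract does not state which form was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(map_scores):
    """Return SD / mean x 100 for mAP scores from repeated trainings.

    Assumption: sample standard deviation (n - 1 denominator); the
    abstract does not say whether sample or population SD was used.
    """
    if len(map_scores) < 2:
        raise ValueError("need scores from at least two training runs")
    return stdev(map_scores) / mean(map_scores) * 100.0


# Hypothetical mAP50 values (%) from five re-trainings of one model.
# Illustrative numbers only, NOT results reported in the paper.
example_runs = [99.1, 99.3, 99.2, 99.4, 99.0]
cv = coefficient_of_variation(example_runs)
print(f"mAP50 coefficient of variation: {cv:.2f}%")
```

A low value indicates that re-training the same architecture on the same data gives nearly the same mAP, so differences between models that are larger than this spread are less likely to be artefacts of training stochasticity.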