College of Information Engineering, Sichuan Agricultural University, Ya'an, China.
College of Electrical Engineering, Sichuan Agricultural University, Ya'an, China.
PLoS One. 2024 Mar 21;19(3):e0299902. doi: 10.1371/journal.pone.0299902. eCollection 2024.
Accurate identification of small tea buds is a key technology for tea-harvesting robots and directly affects tea quality and yield. However, the complexity of the tea-plantation environment and the diversity of tea buds make accurate identification an enormous challenge. Current methods based on traditional image processing and machine learning fail to extract the subtle features and morphology of small tea buds effectively, resulting in low accuracy and robustness. To achieve accurate identification, this paper proposes a small-object detection algorithm called STF-YOLO (Small Target Detection with Swin Transformer and Focused YOLO), which integrates the Swin Transformer module into the YOLOv8 network to improve small-object detection. The Swin Transformer module extracts visual features with a self-attention mechanism that captures the global and local context of small objects, enhancing feature representation. The YOLOv8 network is an object detector based on deep convolutional neural networks, offering high speed and precision. On top of YOLOv8, Focus and depthwise-convolution modules are introduced to reduce computation and parameter count, enlarge the receptive field, widen the feature channels, and improve feature fusion and transmission. Additionally, the Wise Intersection over Union (Wise-IoU) loss is used to optimize the network. Experiments on a self-built tea-bud dataset demonstrate that the STF-YOLO model achieves outstanding results: an accuracy of 91.5% and a mean Average Precision of 89.4%, significantly better than those of other detectors. Compared with mainstream algorithms (YOLOv8, YOLOv7, YOLOv5, and YOLOX), the model improves accuracy by 5-20.22 percentage points and F1 score by 0.03-0.13, demonstrating its effectiveness in enhancing small-object detection performance.
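The Focus module mentioned above performs a space-to-depth slicing of the input before convolution, trading spatial resolution for channel depth without discarding pixels. A minimal NumPy sketch of that slicing step (the abstract does not give the exact implementation, so the layout below follows the commonly used YOLOv5-style Focus arrangement) might look like this:

```python
import numpy as np

def focus_slice(x: np.ndarray) -> np.ndarray:
    """Focus-style space-to-depth slicing: rearrange a (C, H, W)
    feature map into (4C, H/2, W/2) by interleaved pixel sampling.
    Every input pixel appears exactly once, so no information is lost;
    the halved resolution reduces downstream computation while the
    quadrupled channel count widens the features."""
    c, h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0, "spatial dims must be even"
    return np.concatenate([
        x[:, 0::2, 0::2],  # even rows, even cols
        x[:, 1::2, 0::2],  # odd rows, even cols
        x[:, 0::2, 1::2],  # even rows, odd cols
        x[:, 1::2, 1::2],  # odd rows, odd cols
    ], axis=0)

x = np.arange(3 * 4 * 4, dtype=np.float32).reshape(3, 4, 4)
y = focus_slice(x)
print(y.shape)  # (12, 2, 2)
```

In the full module this slicing is followed by a convolution over the widened channels; pairing it with a depthwise convolution (one filter per channel) is what keeps the parameter count low.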
This research provides a technical means for the accurate identification of small tea buds in complex environments and offers insights into small-object detection. Future research can further optimize the model structure and parameters for more scenarios and tasks, and explore data augmentation and model-fusion methods to improve generalization ability and robustness.
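The Wise-IoU loss used to optimize the network builds on the plain IoU loss by weighting it with a distance-based attention term. The abstract does not spell out the variant used, so the sketch below assumes the WIoU-v1 formulation, in which the IoU loss is scaled by an exponential factor computed from the center distance and the smallest enclosing box (that factor is treated as a constant, i.e. detached, during backpropagation in the original formulation):

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0]); iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2]); iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def wiou_loss(pred, target):
    """WIoU-v1-style loss (sketch): (1 - IoU) scaled by a focusing
    weight that grows with the normalized center distance, so poorly
    localized small boxes contribute a larger gradient."""
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # width/height of the smallest box enclosing both boxes
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    r = np.exp(((cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2)
               / (wg ** 2 + hg ** 2))
    return r * (1.0 - iou(pred, target))

print(wiou_loss((0, 0, 10, 10), (0, 0, 10, 10)))  # 0.0 for a perfect match
```

For a perfect match the center distance is zero, so the attention factor is 1 and the loss reduces to the plain IoU loss of 0; offset boxes incur both a lower IoU and a larger focusing weight.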