Liu Guoxu, Zhang Yonghui, Liu Jun, Liu Deyong, Chen Chunlei, Li Yujie, Zhang Xiujie, Touko Mbouembe Philippe Lyonel
School of Computer Engineering, Weifang University, Weifang, China.
Shandong Provincial University Laboratory for Protected Horticulture, Weifang University of Science and Technology, Weifang, China.
Front Plant Sci. 2024 Sep 26;15:1452821. doi: 10.3389/fpls.2024.1452821. eCollection 2024.
Accurate fruit detection is crucial for automated fruit picking. However, real-world scenarios, influenced by complex environmental factors such as illumination variation, occlusion, and overlap, pose significant challenges to accurate fruit detection. These challenges in turn hinder the commercialization of fruit-harvesting robots. To address them, a tomato detection model named YOLO-SwinTF, based on YOLOv7, is proposed. Integrating Swin Transformer (ST) blocks into the backbone network enables the model to capture global information by modeling long-range visual dependencies. Trident Pyramid Networks (TPN) are introduced to overcome the limitations of PANet's focus on communication-based processing: TPN incorporates multiple self-processing (SP) modules within the existing top-down and bottom-up architectures, allowing feature maps to generate new findings for communication. In addition, Focaler-IoU is introduced to reconstruct the original intersection-over-union (IoU) loss so that the loss function can adjust its focus according to the distribution of difficult and easy samples. The proposed model is evaluated on a tomato dataset, and the experimental results demonstrate that its detection recall, precision, F-score, and AP reach 96.27%, 96.17%, 96.22%, and 98.67%, respectively, improvements of 1.64, 0.92, 1.28, and 0.88 percentage points over the original YOLOv7 model. Compared to other state-of-the-art detection methods, this approach achieves superior accuracy while maintaining comparable detection speed. In addition, the proposed model exhibits strong robustness under various lighting and occlusion conditions, demonstrating its significant potential for tomato detection.
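The Focaler-IoU reconstruction mentioned in the abstract can be illustrated with a minimal sketch. This assumes the piecewise-linear remapping from the original Focaler-IoU formulation, with lower and upper thresholds `d` and `u` (names and defaults here are illustrative, not values reported by this paper): IoU values below `d` are clamped to 0, values above `u` to 1, and values in between are rescaled linearly, so choosing `d` and `u` shifts the loss's emphasis between easy and hard samples.

```python
def focaler_iou(iou: float, d: float = 0.0, u: float = 0.95) -> float:
    """Piecewise-linear Focaler-IoU remapping (sketch).

    d, u: assumed lower/upper IoU thresholds controlling whether the
    loss focuses on easy samples (low d, u) or hard samples (high d, u).
    """
    if iou < d:
        return 0.0
    if iou > u:
        return 1.0
    return (iou - d) / (u - d)


def focaler_iou_loss(iou: float, d: float = 0.0, u: float = 0.95) -> float:
    """Loss based on the remapped IoU: L = 1 - IoU_focaler."""
    return 1.0 - focaler_iou(iou, d, u)
```

In practice this remapped term would replace the plain IoU term inside the detector's box-regression loss; the thresholds then act as tuning knobs over the easy/hard sample distribution described in the abstract.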