Zhao Yun, Chen Yijia, Xu Xing, He Yong, Gan Hao, Wu Na, Wang Zhechen, Sun Xi, Wang Yali, Skobelev Petr, Mi Yanan
School of Artificial Intelligence and Information Engineering, Zhejiang University of Science and Technology, Hangzhou, China.
College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou, China.
Front Plant Sci. 2025 Jul 8;16:1618214. doi: 10.3389/fpls.2025.1618214. eCollection 2025.
Screening and cultivating healthy small tomatoes, along with accurately predicting their yields, are crucial for sustaining the economy of the tomato industry. However, in field scenarios, counting small tomato fruits is often hindered by environmental factors such as leaf shading. To address this challenge, this study proposed the Ta-YOLO modeling framework, aimed at improving the efficiency and accuracy of small tomato fruit detection. We captured images of small tomatoes at various stages of ripeness in real-world settings and compiled them into datasets for training and testing the model. First, we used the Space-to-Depth module to exploit the implicit features of the images while keeping the backbone network lightweight. Next, we developed a novel pyramid pooling module (DASPPF) that captures global information through average pooling, reducing the impact of edge and background noise on detection. We also added a tiny-target detection head alongside the original detection heads, enabling multi-scale detection of small tomatoes. To further sharpen the model's focus on relevant information and improve its ability to recognize small targets, we designed a multi-dimensional attention structure (CSAM) that generates more informative feature maps. Finally, we proposed the EWDIoU bounding box loss function, which uses a 2D Gaussian distribution to improve the model's accuracy and robustness. The experimental results showed that Ta-YOLO had 10.58M parameters, required 14.4G FLOPs, and ran at 131.58 FPS, with a mean average precision (mAP) of 84.4%. Ta-YOLO can thus better count tomatoes at different maturity levels, which helps improve the efficiency of small tomato production and planting.
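The Space-to-Depth step mentioned in the abstract can be illustrated with a minimal sketch. It shows only the generic rearrangement (each block x block spatial neighbourhood is moved into the channel dimension, so resolution drops without discarding pixels), not the authors' Ta-YOLO module, whose exact layer configuration the abstract does not give. The function name `space_to_depth`, the `block` parameter, and the nested-list (C, H, W) layout are assumptions made for this example.

```python
def space_to_depth(x, block=2):
    """Rearrange a (C, H, W) nested list into (C*block*block, H/block, W/block).

    Illustrative only: the name, signature, and list layout are assumptions,
    not taken from the Ta-YOLO paper.
    """
    height, width = len(x[0]), len(x[0][0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")

    out = []
    # Each (row_off, col_off) pair selects one strided sub-grid of the image.
    for row_off in range(block):
        for col_off in range(block):
            for channel in x:
                # Keep every block-th row and column, starting at the offsets.
                sub_grid = [row[col_off::block] for row in channel[row_off::block]]
                out.append(sub_grid)
    return out


# One 4x4 channel becomes four 2x2 channels; every pixel is kept.
image = [[[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]]
sub_maps = space_to_depth(image)
```

Because every input value survives the rearrangement, a following convolution sees all of the fine-grained detail at half the spatial resolution, which is the usual motivation for preferring this over strided convolution or pooling when targets are small.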