Mao Liang, Guo Zihao, Liu Mingzhe, Li Yue, Wang Linlin, Li Jie
Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence Application Technology Research Institute, Shenzhen Polytechnic University, Shenzhen, China.
School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, China.
Front Neurorobot. 2025 Feb 7;18:1518878. doi: 10.3389/fnbot.2024.1518878. eCollection 2024.
To enhance the detection of litchi fruits in natural scenes and to address challenges such as dense occlusion and small-target identification, this paper proposes a novel multimodal target detection method, denoted YOLOv5-Litchi.
Initially, the Neck network of YOLOv5s is simplified by changing its FPN+PAN structure to an FPN structure and increasing the number of detection heads from three to five. Additionally, the detection heads with resolutions of 80 × 80 pixels and 160 × 160 pixels are replaced by TSCD detection heads to enhance the model's ability to detect small targets. Subsequently, the localization loss function is replaced with the EIoU loss function, and the confidence loss is replaced with VFLoss, to further improve the accuracy of the detection bounding box and reduce the missed detection rate for occluded targets. A sliding-slice method is then employed to predict image targets, thereby further reducing the miss rate for small targets.
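Two of the components above can be illustrated briefly. The EIoU loss extends IoU with a center-distance term and separate width/height penalties, all normalized by the smallest enclosing box; the sliding-slice step tiles a large image into overlapping crops so small fruits occupy more pixels per inference. The sketch below is a minimal illustration of both ideas in plain Python, not the paper's implementation; box format, tile size, and overlap ratio are assumptions.

```python
def eiou_loss(box_p, box_g):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    EIoU = 1 - IoU + rho^2/c^2 + (dw)^2/Cw^2 + (dh)^2/Ch^2, where rho is the
    center distance, c the diagonal of the smallest enclosing box, and
    Cw, Ch its width and height.
    """
    x1p, y1p, x2p, y2p = box_p
    x1g, y1g, x2g, y2g = box_g

    # Intersection over union
    iw = max(0.0, min(x2p, x2g) - max(x1p, x1g))
    ih = max(0.0, min(y2p, y2g) - max(y1p, y1g))
    inter = iw * ih
    area_p = (x2p - x1p) * (y2p - y1p)
    area_g = (x2g - x1g) * (y2g - y1g)
    iou = inter / (area_p + area_g - inter)

    # Smallest enclosing box and its squared diagonal
    cw = max(x2p, x2g) - min(x1p, x1g)
    ch = max(y2p, y2g) - min(y1p, y1g)
    c2 = cw ** 2 + ch ** 2

    # Squared distance between box centers
    rho2 = (((x1p + x2p) - (x1g + x2g)) / 2) ** 2 + \
           (((y1p + y2p) - (y1g + y2g)) / 2) ** 2

    # Width and height penalties, normalized by the enclosing box
    dw2 = ((x2p - x1p) - (x2g - x1g)) ** 2
    dh2 = ((y2p - y1p) - (y2g - y1g)) ** 2

    return 1 - iou + rho2 / c2 + dw2 / cw ** 2 + dh2 / ch ** 2


def tile_origins(size, tile, overlap):
    """1-D origins of overlapping tiles covering `size` pixels (sliding slice).

    Tiles of width `tile` advance by tile*(1-overlap); a final tile is added
    flush with the far edge if the stride leaves pixels uncovered.
    """
    step = max(1, int(tile * (1 - overlap)))
    origins = list(range(0, max(size - tile, 0) + 1, step))
    if origins[-1] + tile < size:
        origins.append(size - tile)
    return origins
```

For example, a perfectly matching prediction gives `eiou_loss(b, b) == 0`, and `tile_origins(1280, 640, 0.5)` yields three overlapping 640-pixel slices at x = 0, 320, and 640; per-slice detections would then be shifted back and merged (e.g., by NMS) to form the full-image prediction.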
Experimental results demonstrate that the proposed model improves accuracy, recall, and mean average precision (mAP) by 9.5, 0.9, and 12.3 percentage points, respectively, compared to the original YOLOv5s model. When benchmarked against other models such as YOLOx, YOLOv6, and YOLOv8, the proposed model's AP value increases by 4.0, 6.3, and 3.7 percentage points, respectively.
The improved network exhibits distinct improvements, primarily in recall and AP, and thereby reduces the missed detection rate: it misses fewer targets and produces more accurate prediction boxes, indicating its suitability for litchi fruit detection. Therefore, this method significantly enhances the detection accuracy of mature litchi fruits and effectively addresses the challenges of dense occlusion and small-target detection, providing crucial technical support for subsequent litchi yield estimation.