Kim Tae Hyong, Kim Ah-Na
Smart Manufacturing Research Group, Korea Food Research Institute, 245, Nongsaengmyeong-ro, Iseo-myeon, Wanju-gun, 55365, Jeollabuk-do, Republic of Korea.
Sci Rep. 2025 Aug 28;15(1):31732. doi: 10.1038/s41598-025-16563-8.
In the Kimbugak manufacturing system, sorting and loading before frying are still performed manually, which imposes a heavy workload on workers and limits scalability. This study focuses on detecting and classifying the physical characteristics of dried laver bugak to enable robotic pick-and-place operations, developing You Only Look Once (YOLO) and Real-Time Detection Transformer (RT-DETR) deep learning detection models based on fused RGB and infrared (IR) images for integration into a robotic automation system. Experiments showed that at least five physical classes are needed for effective robotic handling. A novel approach that fuses RGB and IR images using the Visual Geometry Group 19-layer (VGG19) network is introduced to enhance the input quality for detection. Experimental results show that the YOLOv11l model significantly outperforms the YOLOv8s model, achieving an F1 score of 0.94 and a mean Average Precision at an IoU threshold of 0.5 (mAP@0.5) of 0.95. These results demonstrate that VGG-based image fusion with YOLOv11l is an optimal solution for classifying and locating dried laver bugak. This research highlights the importance of physical class definition, multimodal image fusion, and detector selection in developing an effective automated sorting and loading system for Kimbugak production.
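The abstract does not detail the fusion procedure, so the following is a minimal sketch of one common VGG19-based visible/infrared fusion scheme (per-pixel weighting from l1-norm activity maps of pretrained VGG19 features), written in PyTorch. The layer choice (relu4_1), weighting rule, and preprocessing are assumptions for illustration, not the authors' exact method.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained VGG19 feature extractor (frozen). The relu4_1 layer follows
# common VGG-based visible/infrared fusion work; the paper's exact
# configuration is not given in the abstract.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()

@torch.no_grad()
def deep_features(x, stop_idx=20):  # index 20 == relu4_1 in torchvision's VGG19
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i == stop_idx:
            return x

@torch.no_grad()
def fuse(rgb, ir):
    """Fuse RGB and IR frames using VGG19 activity-map weighting.

    rgb, ir: (1, 3, H, W) float tensors in [0, 1]; a single-channel IR frame
    should be replicated to 3 channels before calling.
    """
    # Activity level of each source: channel-wise l1-norm of deep features.
    a_rgb = deep_features(rgb).abs().sum(dim=1, keepdim=True)
    a_ir = deep_features(ir).abs().sum(dim=1, keepdim=True)
    # Upsample activity maps to image resolution, then softmax-normalize
    # so the two weights sum to 1 at every pixel.
    size = rgb.shape[-2:]
    a_rgb = F.interpolate(a_rgb, size=size, mode="bilinear", align_corners=False)
    a_ir = F.interpolate(a_ir, size=size, mode="bilinear", align_corners=False)
    w = torch.softmax(torch.cat([a_rgb, a_ir], dim=1), dim=1)
    return w[:, 0:1] * rgb + w[:, 1:2] * ir
```

The fused frames would then serve as detector input; with the Ultralytics package, for example, `YOLO("yolo11l.pt").train(data="bugak.yaml", epochs=100)` would train a YOLOv11l model on such a dataset (the dataset file and hyperparameters here are placeholders, not values from the paper).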