A deep learning model with machine vision system for recognizing type of the food during the food consumption.

Author information

Bohlol Pouya, Hosseinpour Soleiman, Firouz Mahmoud Soltani

Affiliations

Department of Agricultural Machinery Engineering, Faculty of Agricultural Engineering, University of Tehran, Karaj, Iran.

Publication information

Sci Rep. 2025 Aug 13;15(1):29734. doi: 10.1038/s41598-025-15755-6.

Abstract

The food industry prioritizes quality control and product knowledge, emphasizing factors such as quantity, freshness, and color. This research addresses Sustainable Development Goals (SDGs) focused on controlling food consumption, promoting health, reducing energy usage, and minimizing environmental impact. The primary objective was to use machine vision and deep learning to identify food products as they are consumed. The study sorts food into 32 classes, grouped under three main categories, and documents images and videos captured during consumption in various situations. Initially, the dataset comprised 12,000 images in 16 classes and 24,000 images in 32 classes; augmentation expanded these to 60,000 and 120,000 images, respectively. The augmented datasets were then evaluated with nine popular deep learning architectures, of which ResNet50 and EfficientNetB5, B6, and B7 proved the most effective. An essential step was tuning hyperparameters, including image size, batch size, learning rate, and optimizer settings, to improve convergence rate and accuracy. The EfficientNetB7 model was adapted for further testing with two prominent optimizers, Adam and Lion. Ultimately, EfficientNetB7 with the Lion optimizer was chosen for the dataset. This configuration achieved 100% accuracy in identifying images of consumed food products in the 16-class case. In the 32-class case, accuracy reached 99%, with a mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE) of 0.0079, 0.035, and 0.18, respectively. These findings illustrate the robustness of the adjusted dataset in combination with the designed deep learning architecture.
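
The abstract names the Lion optimizer without describing it. As an illustration only, the following is a minimal pure-Python sketch of a single Lion update step as the optimizer is generally defined: the step direction is the sign of an interpolation between momentum and gradient, with decoupled weight decay. The function names `sign` and `lion_step`, the learning rate, and the beta values are assumptions chosen for this example; the abstract does not report the authors' settings or training code.

```python
from typing import List, Tuple


def sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0 according to the sign of x."""
    return float((x > 0) - (x < 0))


def lion_step(
    params: List[float],
    grads: List[float],
    momentum: List[float],
    lr: float = 1e-4,          # assumed value, not taken from the paper
    beta1: float = 0.9,        # assumed value, not taken from the paper
    beta2: float = 0.99,       # assumed value, not taken from the paper
    weight_decay: float = 0.0, # assumed value, not taken from the paper
) -> Tuple[List[float], List[float]]:
    """Apply one Lion update and return (new_params, new_momentum)."""
    new_params: List[float] = []
    new_momentum: List[float] = []
    for p, g, m in zip(params, grads, momentum):
        # Blend momentum and current gradient, then keep only the sign,
        # so every parameter moves by the same magnitude (lr).
        direction = sign(beta1 * m + (1.0 - beta1) * g)
        # Decoupled weight decay is added to the signed direction.
        new_params.append(p - lr * (direction + weight_decay * p))
        # The momentum buffer is updated with a separate, slower average.
        new_momentum.append(beta2 * m + (1.0 - beta2) * g)
    return new_params, new_momentum


if __name__ == "__main__":
    # Toy values for illustration; these are not data from the study.
    p, m = lion_step([1.0, -2.0, 0.5], [0.3, -0.7, 0.0], [0.0, 0.0, 0.0], lr=0.1)
    print(p, m)
```

Unlike Adam, which scales each step by running estimates of the gradient's first and second moments, this update stores a single momentum buffer and uses only the sign of the blended value.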

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4e15/12350950/3011f01c4f22/41598_2025_15755_Fig1_HTML.jpg