Enhancing food recognition accuracy using hybrid transformer models and image preprocessing techniques.

Authors

Jagadesh B N, Mantena Srihari Varma, Sathe Asha P, Prabhakara Rao T, Lella Kranthi Kumar, Pabboju Shyam Sunder, Vatambeti Ramesh

Affiliations

School of Computer Science and Engineering, VIT-AP University, Vijayawada, India.

Department of Computer Science and Engineering, SRKR Engineering College, Bhimavaram, 534204, India.

Publication information

Sci Rep. 2025 Feb 15;15(1):5591. doi: 10.1038/s41598-025-90244-4.

DOI:10.1038/s41598-025-90244-4

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11829996/

Abstract

This study presents a robust approach for continuous food recognition essential for nutritional research, leveraging advanced computer vision techniques. The proposed method integrates Mutually Guided Image Filtering (MuGIF) to enhance dataset quality and minimize noise, followed by feature extraction using the Visual Geometry Group (VGG) architecture for intricate visual analysis. A hybrid transformer model, combining Vision Transformer and Swin Transformer variants, is introduced to capitalize on their complementary strengths. Hyperparameter optimization is performed using the Improved Discrete Bat Algorithm (IDBA), resulting in a highly accurate and efficient classification system. Experimental results highlight the superior performance of the proposed model, achieving a classification accuracy of 99.83%, significantly outperforming existing methods. This study underscores the potential of hybrid transformer architectures and advanced preprocessing techniques in advancing food recognition systems, offering enhanced accuracy and efficiency for practical applications in dietary monitoring and personalized nutrition recommendations.
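The abstract names a discrete bat algorithm for hyperparameter optimization but gives no details of it. As a rough illustration only, the sketch below shows how a generic bat-inspired search over a discrete hyperparameter grid might look. It is not the authors' IDBA: the search space, the `toy_validation_error` objective, the update rules, and every constant are assumptions made for this example. In practice the objective would train and validate the classifier rather than score a made-up target.

```python
import random

# Hypothetical search space; the abstract does not list the tuned hyperparameters.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 3e-5, 1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64, 128],
    "dropout": [0.0, 0.1, 0.2, 0.3],
}
KEYS = list(SEARCH_SPACE)


def toy_validation_error(config):
    """Stand-in objective (assumption): index distance to a made-up best setting."""
    target = {"learning_rate": 1e-4, "batch_size": 32, "dropout": 0.1}
    return sum(
        abs(SEARCH_SPACE[k].index(config[k]) - SEARCH_SPACE[k].index(target[k]))
        for k in KEYS
    )


def random_config(rng):
    """Draw one value per hyperparameter uniformly at random."""
    return {k: rng.choice(SEARCH_SPACE[k]) for k in KEYS}


def discrete_bat_search(objective, n_bats=8, n_iters=40, seed=0):
    """Simplified bat-inspired search on a discrete grid (illustrative only)."""
    rng = random.Random(seed)
    bats = [random_config(rng) for _ in range(n_bats)]
    scores = [objective(b) for b in bats]
    best_i = min(range(n_bats), key=scores.__getitem__)
    best, best_score = dict(bats[best_i]), scores[best_i]
    loudness, pulse_rate = 0.9, 0.3

    for _ in range(n_iters):
        for i in range(n_bats):
            candidate = dict(bats[i])
            # Global move: copy some hyperparameters from the best bat,
            # with a random "frequency" controlling how many are copied.
            freq = rng.random()
            for k in KEYS:
                if rng.random() < freq:
                    candidate[k] = best[k]
            # Local move: nudge one hyperparameter of the best solution
            # to a neighbouring grid value.
            if rng.random() > pulse_rate:
                candidate = dict(best)
                k = rng.choice(KEYS)
                values = SEARCH_SPACE[k]
                j = values.index(candidate[k]) + rng.choice([-1, 1])
                candidate[k] = values[max(0, min(len(values) - 1, j))]
            score = objective(candidate)
            # Accept non-worsening moves with probability equal to the loudness.
            if score <= scores[i] and rng.random() < loudness:
                bats[i], scores[i] = candidate, score
            if score < best_score:
                best, best_score = dict(candidate), score
        loudness *= 0.97                          # bats grow quieter over time
        pulse_rate = 1 - (1 - pulse_rate) * 0.97  # and emit pulses more often
    return best, best_score


if __name__ == "__main__":
    best, score = discrete_bat_search(toy_validation_error)
    print(best, score)
```

The loudness and pulse-rate schedule follows the general shape of bat-style metaheuristics: early iterations accept moves readily and favour local steps around the best solution, while later iterations rely more on copying from the best bat. The specific decay factor of 0.97 is arbitrary.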

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/792c/11829996/c034fe3dc108/41598_2025_90244_Fig1_HTML.jpg
